CCCCAACCTGCTCCATTGCTTGGGGGAGCGGTCCATGAGCGCTTGTCTCATCCCT
Line number 258
>c0827 

TCACAGAGGCTCTGAGGCTACCACGAAGATGAACTCTCAGAAATGGGATTGTCA 
CCCTCGATGAGTTTCCAGTTCCCTCTCTGTTGTATGATGACACAAGAAGGTGAAG 
TGTTGCCTCTCTACAACTGGAAGAGGGAGA (SEQ ID NO: 2) 






Line number 260 
>c1083

GTACTAGCGCTTACAGGTCTGTGTGCAGCCATGCCCAGCTTTCTAAGTGGGTGCT 

GGAATCTGACTTCAGGTTTCATGCrrGAGCAGCAAAGCCCTCTTACACAGAGCCA 

TTTCGACAGTTCTGTGACTTAGGTAGACTCACATCTGTCAGGCTAGAATTTCCAA 

AATTGAAAATGAATTCAAAGTGAAATGCTTGGGAAGTAAGTTAAAGATAGGCTA 

AATGGTTAACCCAGCAGTTAGGGTTGCTTTCTGCTCTTCCAGGGGACCTAAGTTT 

GGTATTTTCTAGCACCCACACTTGGTGGCTCAAGCCCTCTAACTCCAGCTGCAGG 

GGATAGGATGCCCTCTTCTAACCTCCACTGGTGAGAATATCTCACACACACACAC 

ACACACACACATGCTCACATACACAACCT (SEQ ID NO:3)
GLUCOSE TRANSPORT-RELATED GENES AND USES THEREOF



TECHNICAL FIELD

This invention relates to molecular biology, cell biology, glucose transport, medicine, 
and type II diabetes. 


STATEMENT AS TO FEDERALLY SPONSORED RESEARCH 
Work on this invention was supported in part with funds from the Federal
government. The government therefore has certain rights in the invention.

BACKGROUND 

Insulin stimulates glucose transport in muscle and fat. One of the most critical
pathways that insulin activates is the rapid uptake of glucose from the circulation in both
muscle and adipose tissue. Most of insulin's effect on glucose uptake in these tissues is
dependent on the insulin-sensitive glucose transporter, GLUT4 (reviewed in Czech and
Corvera, 1999, J. Biol. Chem. 274:1865-1868; Martin et al., 1999, Cell Biochem. Biophys.
30:89-113; Elmendorf et al., 1999, Exp. Cell Res. 253:55-62). The mechanism of insulin
action is impaired in diabetes, leading to less glucose transport into muscle and fat. This is
thought to be a primary defect in type II diabetes. Potentiating insulin action has a beneficial
effect on type II diabetes. This is believed to be the mechanism of action of the drug Rezulin
(troglitazone).

Type II diabetes mellitus (non-insulin-dependent diabetes) is a group of disorders
characterized by hyperglycemia that can involve an impaired insulin secretory response to
glucose and insulin resistance. One effect observed in type II diabetes is a decreased
effectiveness of insulin in stimulating glucose uptake by skeletal muscle. Type II diabetes
accounts for about 85-90% of all diabetes cases. In some cases of type II diabetes the
underlying physiological defect appears to be multifactorial.



SUMMARY 

The invention is based on the discovery of hundreds of genes that are preferentially
expressed in cell types in which glucose transport is affected in type II diabetes, i.e., skeletal
muscle and adipose tissue, as well as certain proteins expressed in glucose-transporting
vesicles. Accordingly, the invention features methods of identifying a gene whose
expression is altered in a glucose transport-related disease or disorder such as type II
diabetes.

The invention includes a method of identifying a gene whose expression is altered in 
a glucose transport-related disorder. The method includes the steps of providing a nucleic 
acid array having 4 or more nucleic acids immobilized on a solid support, each nucleic acid 
having a sequence of 10 or more consecutive nucleotides within any one of the sequences 
listed in Figs. 1, 2A-2R, 3A-3E, 6A-6E, 7A-7U, 8A-8I, 9, 13A-13C, and 14A-14G or a
complement thereof; providing a reference nucleic acid sample prepared from a tissue of a 
normal, control mammal; contacting the array with the reference sample; detecting 
hybridization of the reference sample with nucleic acids in the array, to obtain a reference 
pattern of glucose transport-related gene expression; providing a test nucleic acid prepared 
from a tissue of a mammal having a glucose transport-related disorder; contacting the array 
with the test sample; detecting hybridization of the test nucleic acid with nucleic acids in the 
array, to obtain a test pattern of glucose transport-related gene expression; and comparing the 
reference pattern with the test pattern to detect a gene whose expression is altered in the test 
pattern relative to its expression in the reference pattern. Figs. 6A-6E, 7A-7U, 8A-8I, 9,
13A-13C, and 14A-14G provide GenBank accession numbers. By accessing the database entries
indicated by the accession numbers, one of skill in the art can obtain the nucleotide sequence and
polypeptide sequence for the listed gene. In some embodiments, the array has 10 or more
nucleic acids. In other embodiments, the array has 100 or more nucleic acids. In yet other 
embodiments, the array has not more than 100 nucleic acids, or not more than 300 nucleic 
acids. In certain embodiments of the invention, the sequence is 30 or more nucleotides in 
length. The reference nucleic acid and the test nucleic acid can be cDNAs that are, in some
embodiments, fluorescently labeled.

The invention includes a nucleic acid array having 4 or more nucleic acids
immobilized on a solid support, each nucleic acid having a sequence of 10 or more
consecutive nucleotides within any one of the sequences listed in Figs. 1, 2A-2R, 3A-3E, 6A-6E,
7A-7U, 8A-8I, 9, 13A-13C, and 14A-14G. In some embodiments, the array has 100 or more
nucleic acids. In other embodiments, the array has not more than 100 nucleic acids, not more
than 200 nucleic acids, or not more than 300 nucleic acids.
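
The array definition above reduces to a simple membership test: a candidate probe qualifies if it is 10 or more consecutive nucleotides found within a listed sequence or its complement. The following is a minimal sketch of that test, not a procedure taken from the specification. It assumes that "complement" means the reverse complement read 5' to 3'; the function names are hypothetical, and the target used is only the first 31 nucleotides of the c0827 sequence as printed in Fig. 1.

```python
# Sketch: does a candidate probe satisfy the "10 or more consecutive
# nucleotides within a listed sequence, or a complement thereof" condition?
# Assumption: "complement" is treated as the reverse complement.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def qualifies_as_probe(probe: str, listed: str, min_len: int = 10) -> bool:
    """True if probe has at least min_len nucleotides and occurs as a
    contiguous run in the listed sequence or in its reverse complement."""
    probe = probe.upper()
    listed = listed.upper()
    if len(probe) < min_len:
        return False
    return probe in listed or probe in reverse_complement(listed)


# First 31 nucleotides of c0827 (SEQ ID NO:2) as printed in Fig. 1.
C0827_START = "TCACAGAGGCTCTGAGGCTACCACGAAGATG"

print(qualifies_as_probe("GAGGCTCTGAGG", C0827_START))
print(qualifies_as_probe(reverse_complement("GAGGCTCTGAGG"), C0827_START))
print(qualifies_as_probe("GAGGCT", C0827_START))
```

A production check would also need to handle ambiguity codes and the longer minimum lengths (e.g., 30 nucleotides) mentioned for some embodiments; the `min_len` parameter is there for the latter.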

One aspect of the invention is an isolated nucleic acid molecule having a nucleotide
sequence from any one of SEQ ID NOS:1-3, or a complement thereof. In some
embodiments of the invention, the isolated nucleic acid sequence has a non-nucleic acid
modifying group bound to either a 3' or 5' end of the nucleotide sequence or both; or a
synthetic nucleic acid sequence bound to a 3' or 5' end of the nucleic acid sequence or both.
The invention also includes an isolated polypeptide having an amino acid sequence
encoded by a nucleic acid sequence from any one of SEQ ID NOS:1-3.

Another embodiment of the invention is an isolated nucleic acid molecule having a
nucleic acid sequence from any one of SEQ ID NOS:4-93, or a complement thereof. In
certain embodiments, the nucleotide sequence has a non-nucleic acid modifying group bound
to either a 3' or 5' end of the nucleotide sequence or both; or a synthetic nucleic acid
sequence bound to a 3' or 5' end of the nucleic acid sequence or both. The invention
includes an isolated nucleic acid molecule having a nucleic acid sequence selected from SEQ
ID NOS:4-93, or a complement thereof. The invention also includes an isolated polypeptide
having an amino acid sequence encoded by a nucleic acid sequence selected from any one of
SEQ ID NOS:4-93.

In one aspect, the invention is a method for identifying a candidate agent that
modulates the expression or activity of a glucose transport-related polypeptide. The method
includes the steps of providing a sample containing a glucose transport-related polypeptide;
adding a test agent to the sample; assaying the sample for expression or activity of the
glucose transport-related polypeptide; and comparing the effect of the test agent on
expression or activity of the glucose transport-related polypeptide relative to a control. A
change in glucose transport-related polypeptide expression or activity indicates that the test
agent is a candidate agent that can modulate expression or activity of the glucose
transport-related polypeptide. In some aspects of the method the test agent is a polynucleotide, a
polypeptide, a small non-nucleic acid organic molecule, a small inorganic molecule, an
antibody, an antisense oligonucleotide, or a ribozyme. In yet another embodiment, the
glucose transport-related polypeptide is assayed using an antibody. In some embodiments of
the invention, the glucose transport-related polypeptide is a human glucose transport-related
polypeptide. The method can include the additional step of determining whether glucose
transport is modulated in the presence of the test agent. The test agent can decrease or
increase glucose transport. The assay can be a cell-based assay or a cell-free assay. In
certain embodiments of the invention, the glucose transport-related polypeptide is selected
from the group of polypeptides encoded by sequences having the nucleic acid sequences
listed in Figs. 1, 2A-2R, and 3A-3E, and the polypeptides listed in Figs. 6A-6E, 7A-7U,
8A-8I, 9, 13A-13C, and 14A-14G.

Modulation of expression (nucleic acid or polypeptide) or activity can be an increase
or a decrease in expression or activity compared to a reference. The amount of modulation is
generally at least two fold (i.e., a two fold increase or decrease in expression or activity)
compared to a reference or a control sample. For example, the amount of modulation can be
five fold, ten fold, fifty fold, 100 fold, or more.
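
The two fold criterion can be made concrete with a small calculation. The sketch below is illustrative only: it assumes expression or activity has already been reduced to a single positive number per sample, and the function names and the `min_fold` parameter are hypothetical. A two fold decrease is treated as a test-to-reference ratio of 0.5 or less, mirroring a two fold increase at a ratio of 2 or more.

```python
# Sketch: flag "modulation" as an at-least-N-fold increase or decrease
# of a test measurement relative to a reference or control measurement.

def fold_change(test: float, reference: float) -> float:
    """Ratio of test to reference; both values must be positive."""
    if test <= 0 or reference <= 0:
        raise ValueError("expression or activity values must be positive")
    return test / reference


def is_modulated(test: float, reference: float, min_fold: float = 2.0) -> bool:
    """True if test is at least min_fold higher or lower than reference."""
    if min_fold < 1:
        raise ValueError("min_fold must be at least 1")
    ratio = fold_change(test, reference)
    return ratio >= min_fold or ratio <= 1.0 / min_fold


print(is_modulated(200.0, 100.0))   # two fold increase
print(is_modulated(150.0, 100.0))   # 1.5 fold increase
print(is_modulated(40.0, 100.0))    # 2.5 fold decrease
```

Raising `min_fold` to 5, 10, 50, or 100 gives the stricter thresholds mentioned above.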

The invention includes a method for identifying a candidate agent that modulates
expression of a glucose transport-related polynucleotide. The method includes the steps of
providing a sample in which a glucose transport-related polynucleotide is expressed; adding a
test agent to the sample; detecting expression of the glucose transport-related polynucleotide;
determining the amount of expression of the glucose transport-related polynucleotide; and
comparing the effect of the test agent on the amount of expression of the glucose
transport-related polynucleotide in the sample relative to a control, such that a change in the amount of
expression from the glucose transport-related polynucleotide indicates the test agent is a
candidate agent that can modulate expression of the glucose transport-related polynucleotide.
The test agent can be a polynucleotide, a polypeptide, a small non-nucleic acid organic
molecule, a small inorganic molecule, an antibody, an antisense oligonucleotide, or a
ribozyme. In some embodiments, the glucose transport-related polynucleotide is a human
glucose transport-related polynucleotide. In another aspect of the invention, the method
includes the step of determining whether glucose transport is modulated (e.g., increased or
decreased) in the presence of the test agent. In some embodiments, the glucose
transport-related polynucleotide is selected from the group of sequences listed in Figs. 1, 2A-2R, and
3A-3E, or a complement thereof, and listed in Figs. 6A-6E, 7A-7U, 8A-8I, 9, 13A-13C, and
14A-14G, or a complement thereof. The assay used in the method can be a cell-based assay or
a cell-free assay.

The invention includes a method of diagnosing an individual having or at risk for a
glucose transport-related disorder. The method includes the steps of providing a nucleic acid
array having 4 or more nucleic acids immobilized on a solid support, each nucleic acid
having a sequence of 10 or more nucleotides, the sequence having or containing a sequence
selected from the group of the sequences listed in Figs. 1, 2A-2R, and 3A-3E, or a
complement thereof, and the sequences of the genes listed in Figs. 6A-6E, 7A-7U, 8A-8I,
9, 13A-13C, and 14A-14G, or a complement thereof; providing a nucleic acid sample
from the individual; contacting the array with the sample from the individual; detecting
hybridization of nucleic acid in the sample from the individual with each nucleic acid in the
array, to obtain a pattern of glucose transport-related gene expression; and comparing the pattern
of glucose transport-related gene expression in the sample from the individual with a reference
pattern, such that the comparison indicates whether the individual has or is at risk for a
glucose transport-related disorder. In some aspects of the invention, the array has 10 or more
nucleic acids, or 100 or more nucleic acids. In other aspects of the invention, the array has not
more than 100 nucleic acids, not more than 200 nucleic acids, or not more than 300 nucleic acids.
In some embodiments, the sequence has 30 or more nucleotides. The sample from the individual can
be a cDNA sample, and the cDNA sample can be fluorescently labeled. In some
embodiments, the disorder is type II diabetes.

The invention also includes a nucleic acid array having 4 or more nucleic acids
immobilized on a solid support, each nucleic acid comprising a sequence of 10 or more
nucleotides, the sequence consisting of at least a portion of a sequence selected from the
sequences listed in Figs. 1, 2A-2R, and 3A-3E, or a complement thereof, and Figs. 6A-6E,
7A-7U, 8A-8I, 9, 13A-13C, and 14A-14G, or a complement thereof.
Unless otherwise defined, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention 
belongs. Although methods and materials similar or equivalent to those described herein can 
be used in the practice or testing of the present invention, suitable methods and materials are 
described below. All publications, patent applications, patents, and other references 
mentioned herein are incorporated by reference. In addition, the materials, methods, and 
examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the detailed 
description, and from the claims. 

DESCRIPTION OF DRAWINGS 

Fig. 1 is a depiction of nucleic acid sequences identified in the Muscle-Adipocyte
Union library: c0143 (SEQ ID NO:1), c0827 (SEQ ID NO:2), and c1083 (SEQ ID NO:3).

Figs. 2A-2R are a series of sequences identified in the Muscle-Adipocyte Union 
Library (MAU library) that contain previously unidentified sequences and ESTs. 

Figs. 3A-3E are a series of sequences identified in the Adipocyte Subtractive
(subtractive) library that contain previously unidentified sequences and ESTs.

Fig. 4 is a diagram showing a suppression subtractive hybridization protocol.

Fig. 5 is a diagram showing a protocol for constructing the Muscle-Adipocyte Union
library.

Figs. 6A-6E are a table showing genes expressed in the Adipocyte Subtractive
Library.

Figs. 7A-7U are a table showing genes expressed in the Muscle-Adipocyte Union
Library.

Figs. 8A-8I are a table showing the proteins identified in peaks 1 and 2 of
GLUT4-associated vesicles.

Fig. 9 is a table listing those proteins/genes that are present in one or both of the
subtractive and Muscle-Adipocyte Union libraries and were also identified as proteins
purified from GLUT4 vesicles. "Yes" indicates that a peptide(s) corresponding to the protein
was present in a preparation. "?" indicates that the protein has not yet been identified in this
preparation but its presence has not been excluded.
Figs. 10A-10D are a series of hydrophobicity plots of the c0582 sequence.

Figs. 11A-11D are a series of hydrophobicity plots of the c0139 sequence.

Figs. 12A-12D are a series of hydrophobicity plots of the b0175 sequence.

Figs. 13A-13C are a table listing genes whose expression was not detected in
fibroblasts, and was detected in adipocyte or muscle using GeneChips. Columns marked f1
and f2 are data from the fibroblast replicate chips, columns marked a1 and a2 are data from
the adipocyte replicate chips, and the columns marked m1 and m2 are data from the muscle
replicate chips. A indicates that the gene is absent in a tissue. P indicates that the gene is
present in a tissue. An M indicates a marginal signal, for which the software cannot determine
whether the gene is absent or present.
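
The selection rule behind Figs. 13A-13C can be sketched as a filter over absent/present/marginal calls. This is a hedged illustration rather than the procedure actually used to build the figures: it assumes a gene is kept only when every fibroblast replicate is called A and every replicate of at least one other tissue (adipocyte or muscle) is called P, so that an M call never counts as detected. The gene names and the dictionary layout are hypothetical.

```python
# Sketch: select genes called absent (A) on all fibroblast replicate chips
# and present (P) on all replicate chips of adipocyte and/or muscle.
# Calls are "A" (absent), "P" (present) or "M" (marginal).
from typing import Dict, List

Calls = Dict[str, Dict[str, List[str]]]


def absent_in_fibroblast_present_elsewhere(calls: Calls) -> List[str]:
    """Return gene names meeting the assumed Fig. 13-style criterion."""
    selected = []
    for gene, by_tissue in calls.items():
        absent_f = all(c == "A" for c in by_tissue["fibroblast"])
        present_a = all(c == "P" for c in by_tissue["adipocyte"])
        present_m = all(c == "P" for c in by_tissue["muscle"])
        if absent_f and (present_a or present_m):
            selected.append(gene)
    return selected


# Hypothetical calls for three genes, two replicate chips per tissue
# (columns f1/f2, a1/a2 and m1/m2 in the figure).
example_calls: Calls = {
    "gene_x": {"fibroblast": ["A", "A"], "adipocyte": ["P", "P"], "muscle": ["A", "M"]},
    "gene_y": {"fibroblast": ["A", "P"], "adipocyte": ["P", "P"], "muscle": ["P", "P"]},
    "gene_z": {"fibroblast": ["A", "A"], "adipocyte": ["M", "P"], "muscle": ["A", "A"]},
}

print(absent_in_fibroblast_present_elsewhere(example_calls))
```

Whether a marginal call should disqualify a gene is a design choice; a looser rule would accept M alongside P in the non-fibroblast tissues.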

Figs. 14A-14G are tables listing genes whose expression was determined to be the
same on all fibroblast chips, and increased on both adipocyte or muscle GeneChips compared
to a fibroblast chip. The columns marked f1, f2, and f3 are fibroblast replicate chips. The
columns marked a1, a2, and a3 are adipocyte replicate chips, and the columns marked m1,
m2, and m3 are the muscle replicate chips. NC indicates no change of expression. MI
indicates that there was a moderate increase in expression. An I indicates an increase in
expression. The function classes of the genes listed in the last column are as follows: Class 1
genes encode metabolic proteins; Class 2 genes encode signaling proteins.

Figs. 15A-15B are a table listing highly expressed genes common between the
Muscle-Adipocyte Union library and the Mu-74 GeneChips Arrays.

DETAILED DESCRIPTION 
Library of Glucose Transport-Related Sequences 

Suppression subtractive hybridization has been applied to create libraries (databases)
of glucose transport-related nucleotide sequences. The Muscle-Adipocyte Union (MAU) library
contains about 230 glucose transport-related nucleotide sequences and was made by
identifying nucleotide sequences selectively expressed in fat and muscle tissue, but not in
fibroblasts. Sequences from the subtractive library or the MAU library can be used in the
invention. Generally, the sequences are from the MAU library. Unless indicated otherwise
below, the library referred to is the MAU library. The sequences in the library represent
glucose transport-related genes that are candidates for involvement in insulin-related action,
and thus potential drug targets for glucose transport-related disorders. Glucose
transport-related disorders include diseases such as type II diabetes, obesity, certain types of
cardiovascular disease, and Syndrome X.

The library can be used to construct DNA arrays for identifying glucose transport- 
related genes whose expression is altered (increased or decreased) in diseases or disorders 
characterized by insulin resistance, e.g., type II diabetes, or defects in glucose transport. The
library advantageously enables gene expression pattern comparisons that involve tens or 
hundreds of genes most likely to be involved in insulin resistance and type II diabetes, 
instead of comparisons that involve tens of thousands or hundreds of thousands of genes. 
This focus on a relatively small library advantageously simplifies data analysis and improves 
the signal-to-noise ratio. In addition to being useful for identifying individual glucose 
transport-related genes, DNA arrays of the invention can be used to identify gene expression
patterns indicative of particular forms of type II diabetes or of a predisposition for (i.e., being
at risk for) development of type II diabetes. The predisposition can be a genetic predisposition.

Once specific glucose transport-related genes are identified using the library, assays 
for expression of individual genes can be employed. Specific assays can be employed, for 
example, in diagnostic methods to diagnose type II diabetes, methods for diagnosing 
particular forms of type II diabetes, and methods for identifying individuals who have pre- 
symptomatic forms of type II diabetes or a genetic predisposition for development of type II 
diabetes. Such diagnostic assays may provide useful information for devising therapeutic 
strategies tailored to individual patients. 

The library can also be used to assay expression of individual genes in animal (e.g., 
mouse) models of a disease in which glucose transport is affected. For example, cDNA can 
be prepared from RNA isolated from a mouse having a glucose transport-related disorder 
such as diabetes. The RNA can be isolated from a tissue that normally carries out glucose 
transport (e.g., muscle or adipose tissue). The cDNA is hybridized to sequences from the
MAU library. Expression of the MAU library sequences is then compared to expression of 
the sequences in a mouse that does not have the disorder. A relative increase or decrease in 
the expression of a sequence in the mouse having a glucose transport disorder compared to 
an unaffected mouse indicates that the sequence is involved in the disorder. Such sequences 
are useful, e.g., for indicating genes or gene products as drug targets for treating the disorder.

Sequences in the MAU library fall into three categories: (1) novel sequences (Fig. 1);
(2) sequences from genes for which at least partial sequences were known, but for which no
function was known or predicted (Figs. 2A-2R and 3A-3E); and (3) sequences of genes with
a known or predicted function (included in Figs. 6A-6E and 7A-7U). The novel sequences
are designated c0148 (SEQ ID NO:1), c0827 (SEQ ID NO:2), and c1083 (SEQ ID NO:3),
and they are set forth in Fig. 1.

Some of the library sequences are a novel combination of sequences based on partial
sequencing of genes that were identified in the Adipocyte Subtractive library as differentially
expressed in adipocyte and fibroblast cells combined with overlapping sequences that were
obtained from databanks (GenBank and TIGR (The Institute for Genomic Research)).
Additional library sequences are novel combinations of sequences based on partial
sequencing of genes that are identified in the Muscle-Adipocyte Union Library as
differentially expressed in both adipocyte and muscle cells combined with overlapping
sequences that were obtained from the databanks. Genes in these categories include b0117
(AAFT-like protein with CDP-alcohol phosphatidyltransferases signature sequence; SEQ ID
NO:81), b0175 (GS2 protein; SEQ ID NO:87), c0139 (endophilin-like protein coiled-coil plus
SH3 domain; SEQ ID NO:12), c0250 (SEQ ID NO:17), c0352 (SEQ ID NO:18), c0582 (Rab
GTPase domain; SEQ ID NO:33), c0591 (isoform of TIG2 protein; SEQ ID NO:34), and
c0840 (Clu-like protein; SEQ ID NO:53). These sequences are depicted in Figs. 2A-2R and
3A-3E, and are particularly useful in the methods of the invention.

Sequences that are differentially expressed in adipocytes, muscle cells, or both (as
compared to expression in, e.g., fibroblasts) are useful, e.g., as genes, or as sources of gene
products, that are targets for treatments for disorders involving glucose transport and for
diagnosis of disorders involving aberrant glucose transport such as type II diabetes.


DNA Arrays 

DNAs containing complete or partial sequences from the library of glucose
transport-related sequences can be used to construct conventional DNA arrays (sometimes called DNA
chips or gene chips). A DNA array according to the invention can contain tens, hundreds, or
thousands of individual sequences immobilized (tethered) at discrete, predetermined
locations (addresses or "spots") on a solid, planar support, e.g., glass or nylon. Each spot
may contain more than one DNA molecule, but each DNA molecule at a given address has
an identical nucleotide sequence. The DNA array can be a macroarray or microarray, the
difference being in the size of the DNA spots. Macroarrays contain spots of about 300
microns in diameter or larger and can be imaged using gel or blot scanners. Microarrays
contain spots less than 300 microns, typically less than 200 microns, in diameter.

For analysis and comparison of glucose transport-related gene expression patterns, an
array is constructed using sequences from at least four, e.g., at least 10, 20, 40, 60, 80, or 100
genes in the above-described library. A population of labeled cDNA representing total
mRNA from a sample of a tissue of interest, e.g., muscle or adipose tissue, is contacted with
the DNA array under suitable hybridization conditions. Hybridization of cDNAs with
sequences in the array is detected, e.g., by fluorescence at particular addresses on the solid
support. Thus, a pattern of fluorescence representing a gene expression pattern in the tissue
of a particular individual or group of individuals is obtained. These patterns of glucose
transport-related gene expression can be digitized and stored electronically for computerized
analysis and comparison. For example, an array according to the invention can be used to
compare glucose transport-related gene expression of type II diabetic individuals with each
other, and with non-diabetic individuals. Such comparisons will reveal specific genes whose
expression is increased or decreased in a given tissue type in individuals with type II diabetes
or other glucose transport-related diseases or disorders. Such arrays can also be used to
diagnose individuals having or at risk for a glucose transport-related disorder such as type II
diabetes. For example, a nucleic acid sample (e.g., cDNA) from an individual suspected of
having a glucose transport-related disorder is prepared and hybridized to the array. The
pattern (including the level) of expression of sequences in the sample is compared to a
reference pattern (e.g., representing the pattern of expression in unaffected individuals,
and/or representing the pattern of expression in individuals known to have a particular
glucose transport-related disorder). A pattern of expression in the sample that varies from
that of the unaffected reference, and/or corresponds with the pattern of expression in a
glucose transport disorder indicates that the individual has a glucose transport disorder.
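
One way to picture the comparison of a digitized sample pattern with stored reference patterns is a similarity score. The sketch below uses the Pearson correlation coefficient purely as an assumed, illustrative measure; the text does not prescribe any particular statistic, and the reference names, intensity values, and function names are hypothetical. Each pattern is a list of signal intensities in the same spot order.

```python
# Sketch: assign a sample expression pattern to the most similar stored
# reference pattern, using Pearson correlation as an assumed similarity score.
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length patterns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("patterns must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a constant pattern has no defined correlation")
    return cov / (sd_x * sd_y)


def closest_reference(sample: Sequence[float],
                      references: Dict[str, Sequence[float]]) -> str:
    """Name of the reference pattern that correlates best with the sample."""
    return max(references, key=lambda name: pearson(sample, references[name]))


# Hypothetical intensities for four array addresses.
references = {
    "unaffected": [1.0, 2.0, 3.0, 4.0],
    "disorder": [4.0, 3.0, 2.0, 1.0],
}
sample = [3.9, 3.2, 1.8, 1.1]
print(closest_reference(sample, references))
```

In practice the intensities would first be background-corrected and normalized, and a classification would rest on many replicate reference patterns rather than one per group.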

In some embodiments of the invention, cDNAs are used to form the array. Suitable
cDNAs can be obtained by conventional polymerase chain reaction (PCR) techniques. The
length of the cDNAs can be from 20 to 2,000 nucleotides, e.g., from 100 to 1,000
nucleotides. Other methods known in the art for producing cDNAs can be used. For
example, reverse transcription of a cloned sequence can be used (for example, as described in
Sambrook et al., eds., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, NY, 1989).

The cDNAs are placed ("printed" or "spotted") onto a suitable solid support
(substrate), e.g., a coated glass microscope slide, at specific, predetermined locations
(addresses) in a two-dimensional grid. A small volume, e.g., 5 nanoliters, of a concentrated
DNA solution is used in each spot. Spotting can be carried out using a commercial
microspotting device (sometimes called an arraying machine or gridding robot) according to
the vendor's instructions. Commercial vendors of solid supports and equipment for
producing DNA arrays include BioRobotics Ltd., Cambridge, UK; Corning Science Products
Division, Acton, MA; GENPAK Inc., Stony Brook, NY; SciMatrix, Inc., Durham, NC; and
TeleChem International, Sunnyvale, CA.
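
The idea of a predetermined address for each spotted DNA can be sketched as a mapping from clone identifiers to (row, column) positions in the grid. This is only an illustration: the row-major layout, the function names, and the column count are assumptions, and a real arraying robot uses its own vendor-specific layout files. The volume helper simply multiplies the number of spots by the 5 nanoliter example volume given above.

```python
# Sketch: assign each clone a fixed (row, column) address on a rectangular
# grid, filling row by row, and estimate the total volume spotted.
from typing import Dict, List, Tuple


def assign_addresses(clone_ids: List[str], n_columns: int) -> Dict[str, Tuple[int, int]]:
    """Map each clone identifier to a zero-based (row, column) address."""
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    if len(set(clone_ids)) != len(clone_ids):
        raise ValueError("clone identifiers must be unique")
    return {cid: divmod(i, n_columns) for i, cid in enumerate(clone_ids)}


def total_volume_nl(n_spots: int, nl_per_spot: float = 5.0) -> float:
    """Total DNA solution volume in nanoliters for n_spots spots."""
    return n_spots * nl_per_spot


# Clone names taken from the library; the two-column grid is hypothetical.
layout = assign_addresses(["c0827", "c1083", "c0139", "c0582"], n_columns=2)
print(layout)
print(total_volume_nl(len(layout)))
```

Keeping the address map alongside the scanned image is what lets a fluorescent signal at a given spot be traced back to a specific library sequence.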

The cDNAs can be attached to the solid support by any suitable method. In general,
the linkage is covalent. Suitable methods of covalently linking DNA molecules to the solid
support include amino cross-linking and UV crosslinking. For guidance concerning
construction of cDNA arrays according to the invention, see, e.g., DeRisi et al., 1996, Nature
Genetics 14:457-460; Khan et al., 1999, Electrophoresis 20:223-229; Lockhart et al., 1996,
Nature Biotechnol. 14:1675-1680.

In some embodiments of the invention, the immobilized DNAs in the array are
synthetic oligonucleotides. Preformed oligonucleotides can be spotted to form a DNA array,
using techniques described above with regard to cDNA. In general, however, the
oligonucleotides are synthesized directly on the solid support. Methods for synthesizing
oligonucleotide arrays are known in the art. See, e.g., Fodor et al., U.S. Patent
No. 5,744,305. The sequences of the oligonucleotides represent portions of the sequences in
the library described above. For example, the lengths of oligonucleotides are 10 to 50
nucleotides, e.g., 15, 20, 25, 30, 35, 40, or 45 nucleotides.
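
Candidate oligonucleotides that represent portions of a library sequence can be enumerated by sliding a fixed-length window along the sequence. The sketch below is a minimal illustration under stated assumptions: the function name and the `step` parameter are hypothetical, no melting-temperature or specificity screening is attempted, and the input is only the first 31 nucleotides of the c0827 sequence as printed in Fig. 1.

```python
# Sketch: list every oligonucleotide of a chosen length (10 to 50 nt)
# that is a contiguous portion of a library sequence.
from typing import List


def tile_oligos(sequence: str, length: int = 25, step: int = 1) -> List[str]:
    """Return windows of `length` nucleotides taken every `step` positions."""
    if not 10 <= length <= 50:
        raise ValueError("length must be between 10 and 50 nucleotides")
    if step < 1:
        raise ValueError("step must be at least 1")
    sequence = sequence.upper()
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


# First 31 nucleotides of c0827 (SEQ ID NO:2) as printed in Fig. 1.
C0827_START = "TCACAGAGGCTCTGAGGCTACCACGAAGATG"

oligos = tile_oligos(C0827_START, length=25)
print(len(oligos))
print(oligos[0])
```

A larger `step` gives fewer, less overlapping oligonucleotides; a sequence shorter than the requested length yields an empty list.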

In some embodiments of the invention, the human homologs of the identified
sequences are used in the detection method. Examples of such human homologs are listed
with their GenBank accession numbers in Figs. 6A-6E, 7A-7U, and 8A-8I. In other
embodiments, the sequence used for detection consists of highly conserved regions of the
sequence, e.g., sequence that is highly conserved between homologous mouse and human
sequence.



Sample Preparation and Analysis 

In methods of the invention, the transcription level of a glucose transport-related gene
is assumed to be reflected in the amount of its corresponding mRNA present in cells of
assayed tissue or cell lines derived from specific tissues. In general, mRNA from the cells or
tissue is copied into cDNA under conditions such that the relative amounts of cDNA
produced representing specific genes reflect the relative amounts of the mRNA in the sample.
Comparative hybridization methods involve comparing the amounts of various, specific
mRNAs in two tissue samples, as indicated by the amounts of corresponding cDNAs
hybridized to sequences from the glucose transport-related gene library.

The mRNA used to produce cDNA is generally isolated from other cellular contents
and components. One useful approach for mRNA isolation is a two-step approach. In the
first step, total RNA is isolated. The second step is based on hybridization of the poly(A)
tails of mRNAs to oligo(dT) molecules bound to a solid support, e.g., a chromatographic
column or magnetic beads. Total RNA isolation and mRNA isolation are known in the art
and can be accomplished, for example, using commercial kits according to the vendor's
instructions. Similarly, synthesis of cDNA from isolated mRNA is known in the art and can
be accomplished using commercial kits according to the vendor's instructions. Fluorescent
labeling of cDNA can be achieved by including a fluorescently labeled deoxynucleotide, e.g.,
Cy5-dUTP or Cy3-dUTP, in the cDNA synthesis reaction. For guidance concerning isolation
of mRNA and synthesis of fluorescently labeled cDNA for analysis on a DNA array, see,
e.g., Ross et al., 2000, Nature Genetics 24:227-235.
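
When two differently labeled cDNA populations are compared, the per-gene comparison is commonly summarized as a log ratio of the two signals. The sketch below assumes, for illustration only, that one sample was labeled with Cy5 and the other with Cy3 and that background-corrected intensities are already available per gene; the channel assignment, gene names, and function name are hypothetical and no normalization is applied.

```python
# Sketch: per-gene log2 ratio of test-channel to reference-channel signal.
# Positive values mean higher signal in the test sample, negative values lower.
import math
from typing import Dict


def log2_ratios(test_signal: Dict[str, float],
                reference_signal: Dict[str, float]) -> Dict[str, float]:
    """log2(test / reference) for every gene measured in both channels."""
    ratios = {}
    for gene, test_value in test_signal.items():
        if gene not in reference_signal:
            continue  # gene not measured in the reference channel
        ref_value = reference_signal[gene]
        if test_value <= 0 or ref_value <= 0:
            raise ValueError(f"non-positive intensity for {gene}")
        ratios[gene] = math.log2(test_value / ref_value)
    return ratios


# Hypothetical intensities: Cy5 channel = test sample, Cy3 = reference sample.
cy5_test = {"gene_x": 400.0, "gene_y": 100.0, "gene_z": 250.0}
cy3_reference = {"gene_x": 100.0, "gene_y": 400.0, "gene_z": 250.0}
print(log2_ratios(cy5_test, cy3_reference))
```

On this scale a two fold increase corresponds to a value of 1 and a two fold decrease to a value of -1, which makes increases and decreases symmetric.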

In the invention, conventional techniques for hybridization and washing of DNA
arrays, detection of hybridization, and data analysis can be employed routinely without undue
experimentation. Commercial vendors of hardware and software for scanning DNA arrays
and analyzing data include Cartesian Technologies, Inc. (Irvine, CA); GSI Lumonics
(Watertown, MA); Genetic Microsystems Inc. (Woburn, MA); and Scanalytics, Inc. (Fairfax,
VA).
Isolated Nucleic Acid Molecules

The invention provides certain novel, isolated nucleic acids that encode murine
glucose transport-related polypeptides, or biologically active portions thereof (Fig. 1). In
addition to forming part of the library, these nucleic acids can be used as hybridization
probes to identify the full-length genes that they represent, and to isolate related nucleic
acids, e.g., murine nucleic acids can be used to identify and clone human homologs. These
nucleic acids also can be used to design PCR primers for PCR amplification of related
nucleic acid molecules. The full-length genes identified and isolated using these novel
sequences are predicted to function in insulin-responsive glucose transport systems in
mammalian muscle cells and adipose cells.

As used herein, "isolated DNA" means DNA that has been separated from DNA that 
flanks the DNA in the genome of the organism in which the DNA naturally occurs. The term 
therefore includes recombinant DNA incorporated into a vector, e.g., a cloning vector or an 
expression vector. The term also includes a molecule such as a cDNA, a genomic fragment,
a fragment produced by PCR, or a restriction fragment. The term also includes a
recombinant nucleotide sequence that is part of a hybrid gene construct, i.e., a construct
encoding a fusion protein. The term excludes an isolated chromosome. Isolated nucleic
acids of the invention (e.g., SEQ ID NOS:1-93) can include modifications at the 3' and/or 5'
end of the molecule including a metal, a modified nucleotide residue, or a nucleotide
sequence that is not contiguous with the sequence of interest in nature. Such modifications
can also be made to the sequences or fragments of sequences used in the invention (e.g., 
sequences derived from the genes listed in Figs. 6-9 and 13-15). 

A full length coding sequence that contains a novel nucleotide sequence of the 
invention, e.g., a nucleic acid molecule containing a sequence set forth in Fig. 1, or a
complement thereof, can be isolated using conventional molecular biology techniques and
the sequence information provided herein. For example, the isolation can be accomplished
without undue experimentation by applying techniques described in numerous treatises and
reference manuals. For general guidance and specific protocols, see, e.g., Sambrook et al.,
eds., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; Ausubel et al. (eds.),
1994, Current Protocols in Molecular Biology, John Wiley & Sons, Inc.; Innes et al. (eds.),
1990, PCR Protocols, Academic Press.

A nucleic acid molecule of the invention can be amplified using cDNA, mRNA, or 
genomic DNA as a template and appropriate oligonucleotide primers according to standard 
PCR amplification techniques. Once isolated, the full-length nucleic acid can be cloned into 
an appropriate vector and characterized by conventional DNA sequence analysis, using 
standard techniques and equipment. 

A nucleic acid fragment encoding a biologically active portion of a polypeptide 
encoded by a novel nucleic acid of the invention can be identified and prepared by isolating a 
portion of any of the sequences useful in the invention, expressing the encoded portion of the 
polypeptide (e.g., by recombinant expression in vitro) and assessing the activity of
the encoded portion of the polypeptide. 

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequence set forth in Fig. 1 due to degeneracy of the genetic code and thus
encode the same amino acid sequence as that encoded by the nucleotide sequence set forth in
Fig. 1. The invention further encompasses isolated nucleic acid molecules that hybridize
with the sequences set forth in Fig. 1 under high stringency conditions. As used herein,
"high stringency" means the following: hybridization at 42° C in the presence of 50%
formamide; a first wash at 65° C with 2 x SSC containing 1% SDS; followed by a second
wash at 65° C with 0.1 x SSC.
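
The degeneracy of the genetic code referred to above can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the function names and the short example sequences are hypothetical and are not taken from Fig. 1.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard nuclear code applies; "*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Number of codons that encode each amino acid (e.g., 6 for leucine).
CODONS_PER_RESIDUE = Counter(AMINO)


def translate(dna):
    """Translate a DNA coding sequence, read in frame from its first base."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def synonymous_sequence_count(peptide):
    """Count the distinct coding sequences that encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


# Two hypothetical sequences that differ at one nucleotide (GCT vs. GCC)
# yet encode the same amino acid sequence.
seq_a = "ATGGCTTAA"
seq_b = "ATGGCCTAA"
same_protein = translate(seq_a) == translate(seq_b)
```

Under this sketch, a peptide containing only methionine and tryptophan has a single coding sequence, whereas each alanine multiplies the number of possible coding sequences by four.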

In addition to the nucleotide sequences set forth in Fig. 1, it will be appreciated by
those skilled in the art that DNA sequence polymorphisms that lead to changes in the amino 
acid sequence may exist within a population (e.g., the human population). Such genetic 
polymorphisms may exist among individuals within a population due to natural allelic 
variation. An allele is one of a group of genes that occur alternatively at a given genetic 
locus. As used herein, "allelic variation" means variation in a nucleotide sequence that 
occurs at a given locus, or variation in an amino acid sequence of a polypeptide encoded by 
the nucleotide sequence at a given locus. Alternative alleles can be identified by sequencing 
the gene of interest in a number of different individuals. This can be accomplished by using 
hybridization probes to identify nucleic acids corresponding to the same genetic locus in a 
variety of individuals. The nucleic acid is then sequenced (e.g., amplified using PCR and the
PCR products are sequenced) to identify variations. Isolated nucleic acids containing the
nucleotide sequences of Fig. 1 that display allelic variations while retaining functional
activity are within the scope of the invention.

In some embodiments of the invention, changes are introduced into the sequences of
Fig. 1 by mutation thereby leading to changes in the amino acid sequence of the encoded 
protein, without altering the biological activity of the protein. For example, one can make 
nucleotide substitutions leading to amino acid substitutions at non-essential amino acid 
residues. A non-essential amino acid residue is a residue that can be altered from the wild- 
type sequence without altering the biological activity of the gene product (e.g., a protein). 
For example, amino acid residues that are not conserved or only semi-conserved among 
homologs of various species may be non-essential for activity and thus would be likely 
targets for alteration. In contrast, amino acid residues that are conserved among the 
homologs of various species (e.g., murine and human) may be necessary for activity and thus 
would not be likely targets for alteration. 

An isolated nucleic acid molecule encoding a variant protein can be created by 
introducing one or more nucleotide substitutions, additions, or deletions into the nucleotide 
sequence of c0148 (SEQ ID NO:1), c0827 (SEQ ID NO:2), and c1083 (SEQ ID NO:3) such
that one or more amino acid substitutions, additions, or deletions are introduced into the
encoded protein. Mutations can be introduced by standard techniques, such as site-directed
mutagenesis and PCR-mediated mutagenesis. Preferably, conservative amino acid
substitutions are made at one or more predicted non-essential amino acid residues. A
"conservative amino acid substitution" is one in which the amino acid residue is replaced
with an amino acid residue having a similar side chain. Families of amino acid residues
having similar side chains have been defined in the art. These families include amino acids
with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid,
glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine,
threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine,
valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan,
histidine). Alternatively, mutations can be introduced randomly along all or part of the
coding sequence, such as by saturation mutagenesis, and the resultant mutants can be
screened for biological activity to identify mutants that retain activity. Following
mutagenesis, the encoded protein can be expressed recombinantly and the activity of the
protein can be determined.
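
The side chain families listed above can be encoded directly to check whether a proposed substitution is conservative. The following Python sketch uses one-letter amino acid codes; the function names are hypothetical, and the family membership simply restates the examples given in the text.

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
# A residue may belong to more than one family (e.g., threonine, tyrosine).
FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original, replacement):
    """Return the names of the families that contain both residues."""
    return sorted(
        name for name, members in FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original, replacement):
    """A substitution is treated as conservative if the two residues differ
    and share at least one side chain family."""
    return original != replacement and bool(shared_families(original, replacement))
```

For example, replacing lysine with arginine is conservative under this definition (both are basic), whereas replacing aspartic acid with lysine is not.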

Isolating Homologous Sequences from Other Species

The human homologs of glucose transport-related genes and their products are useful
for various embodiments of the present invention including diagnosis of glucose transport-related
disorders such as type II diabetes. Homologs have already been identified for certain
genes and GenBank Accession numbers are supplied for these. In those cases where a
human homolog is not identified, several approaches can be used to identify such genes.
These methods include low stringency hybridization screens of human libraries with a mouse
glucose transport-related nucleic acid sequence, polymerase chain reactions (PCR) of human
DNA sequence primed with degenerate oligonucleotides derived from a mouse glucose
transport-related gene, two-hybrid screens, and database screens for homologous sequences.

Antisense Nucleic Acids 

The invention includes antisense nucleic acid molecules, i.e., nucleic acid molecules
whose nucleotide sequence is complementary to all or part of an mRNA based on the
sequences c0148, c0827, and c1083 (Fig. 1). An antisense nucleic acid molecule can be
antisense to all or part of a non-coding region of the coding strand of a nucleotide sequence
encoding a polypeptide of the invention. The non-coding regions ("5' and 3' untranslated
regions") are the 5' and 3' sequences that flank the coding region and are not translated into
amino acids.
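
Because an antisense molecule is complementary to its target mRNA, its sequence is the reverse complement of the targeted region. The following Python sketch derives a DNA antisense oligonucleotide for a chosen window of an mRNA; the function name and the example mRNA are hypothetical and are not taken from Fig. 1.

```python
# Map each mRNA base to its complementary DNA base.
RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length):
    """Return a DNA oligonucleotide, written 5' to 3', that is complementary
    to mrna[start:start + length] (0-based start, mRNA written 5' to 3')."""
    mrna = mrna.upper()
    if start < 0 or length <= 0 or start + length > len(mrna):
        raise ValueError("target window lies outside the mRNA sequence")
    target = mrna[start:start + length]
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(RNA_TO_COMPLEMENTARY_DNA)[::-1]


# Hypothetical example: an oligonucleotide against the first six bases.
oligo = antisense_oligo("AUGGCCUAA", 0, 6)
```

The window can be placed in a coding region or in a 5' or 3' untranslated region, depending on which part of the mRNA is to be targeted.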

An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40,
45, or 50 nucleotides or more in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis and enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide)
can be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of
modified nucleotides which can be used to generate the antisense nucleic acid include
5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense
nucleic acid can be produced biologically using an expression vector into which a nucleic
acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted
nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described
further in the following subsection).

The antisense nucleic acid molecules of the invention can be administered to a
mammal, e.g., a human patient. Alternatively, they can be generated in situ such that they
hybridize with or bind to cellular mRNA and/or genomic DNA encoding a selected
polypeptide of the invention to thereby inhibit expression, e.g., by inhibiting transcription
and/or translation. The hybridization can be by conventional nucleotide complementarity to
form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule which
binds to DNA duplexes, through specific interactions in the major groove of the double helix.
An example of a route of administration of antisense nucleic acid molecules of the invention
includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules can
be modified to target selected cells and then administered systemically. For example, for
systemic administration, antisense molecules can be modified such that they specifically bind
to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense
nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or
antigens. The antisense nucleic acid molecules can also be delivered to cells using the
vectors described herein. For example, to achieve sufficient intracellular concentrations of
the antisense molecules, vector constructs can be used in which the antisense nucleic acid
molecule is placed under the control of a strong pol II or pol III promoter.

An antisense nucleic acid molecule of the invention can be an α-anomeric nucleic
acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids
with complementary RNA in which, contrary to the usual β-units, the strands run parallel to
each other (Gaultier et al., 1987, Nucleic Acids Res. 15:6625-6641). The antisense nucleic
acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al., 1987, Nucleic
Acids Res. 15:6131-6148) or a chimeric RNA-DNA analog (Inoue et al., 1987, FEBS Lett.
215:327-330).

Antisense molecules that are complementary to all or part of a glucose transport-related
gene are also useful for assaying expression of such genes using hybridization
methods known in the art. For example, the antisense molecule is labeled (e.g., with a
radioactive molecule) and an excess amount of the labeled antisense molecule is hybridized
to an RNA sample. Unhybridized labeled antisense molecule is removed (e.g., by washing),
and the amount of hybridized antisense molecule is measured and used to calculate the
amount of expression of the glucose transport-related gene. In general, antisense molecules
used for this purpose can hybridize to a sequence from a glucose transport-related gene
under high stringency conditions such as those described herein. When the RNA sample is
first used to synthesize cDNA, a sense molecule can be used. It is also possible to use a
double-stranded molecule in such assays as long as the double-stranded molecule is
adequately denatured prior to hybridization.



Ribozymes 

The invention also encompasses ribozymes that have specificity for the sequences
c0148, c0827, and c1083. Ribozymes are catalytic RNA molecules with ribonuclease
activity that are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to
which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes
(described in Haselhoff and Gerlach, 1988, Nature 334:585-591)) can be used to catalytically
cleave mRNA transcripts to thereby inhibit translation of the protein encoded by the mRNA.
A ribozyme having specificity for a nucleic acid molecule of the invention can be designed
based upon the nucleotide sequence of a cDNA disclosed herein. For example, a derivative
of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the
active site is complementary to the nucleotide sequence to be cleaved in a glucose transport-related
mRNA (Cech et al., U.S. Patent No. 4,987,071; and Cech et al., U.S. Patent
No. 5,116,742). Alternatively, an mRNA encoding a polypeptide of the invention can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel and Szostak, 1993, Science 261:1411-1418.

The invention also encompasses nucleic acid molecules that form triple helical 
structures. For example, expression of a polypeptide of the invention can be inhibited by 
targeting nucleotide sequences complementary to the regulatory region of the gene encoding 
the polypeptide (e.g., the promoter and/or enhancer) to form triple helical structures that 
prevent transcription of the gene in target cells. See generally Helene, 1991, Anticancer 
Drug Des. 6(6):569-84; Helene, 1992, Ann. N.Y. Acad. Sci. 660:27-36; and Maher, 1992,
Bioassays 14(12):807-15.

In various embodiments, the nucleic acid molecules of the invention can be modified 
at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, 
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate 
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et
al., 1996, Bioorganic & Medicinal Chemistry 4(1):5-23). Peptide nucleic acids (PNAs) are
nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is
replaced by a pseudopeptide backbone and only the four natural nucleobases are retained.
The neutral backbone of PNAs allows for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols, e.g., as described in Hyrup et al., 1996,
supra; Perry-O'Keefe et al., 1996, Proc. Natl. Acad. Sci. USA 93:14670-675.

PNAs can be used in therapeutic and diagnostic applications. For example, PNAs can 
be used as antisense or antigene agents for sequence-specific modulation of gene expression 
by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs can also
be used, e.g., in the analysis of single base pair mutations in a gene by, e.g., PNA directed
PCR clamping; as artificial restriction enzymes when used in combination with other
enzymes, e.g., S1 nucleases (Hyrup, 1996, supra); or as probes or primers for DNA sequence
and hybridization (Hyrup, 1996, supra; Perry-O'Keefe et al., 1996, Proc. Natl. Acad. Sci.
USA 93:14670-675).

PNAs can be modified, e.g., to enhance their stability or cellular uptake, by attaching
lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the
use of liposomes or other techniques of drug delivery known in the art. For example,
PNA-DNA chimeras can be generated which may combine the advantageous properties of PNA
and DNA. Such chimeras allow DNA recognition enzymes, e.g., RNAse H and DNA
polymerases, to interact with the DNA portion while the PNA portion would provide high
binding affinity and specificity. PNA-DNA chimeras can be linked using linkers of
appropriate lengths selected in terms of base stacking, number of bonds between the
nucleobases, and orientation (Hyrup, 1996, supra). The synthesis of PNA-DNA chimeras can
be performed as described in Hyrup, 1996, supra, and Finn et al., 1996, Nucleic Acids Res.
24:3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry and modified nucleoside analogs. Compounds such as
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite can be used as a link
between the PNA and the 5' end of DNA (Mag et al., 1989, Nucleic Acids Res. 17:5973-88).
PNA monomers are then coupled in a stepwise manner to produce a chimeric molecule with
a 5' PNA segment and a 3' DNA segment (Finn et al., 1996, Nucleic Acids Res. 24:3357-63).
Alternatively, chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA
segment (Petersen et al., 1975, Bioorganic Med. Chem. Lett. 5:1119-11124).

In some embodiments, the oligonucleotide includes other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across
the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. USA 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No.
WO 88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In
addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see,
e.g., Krol et al., 1988, Bio/Techniques 6:958-976) or intercalating agents (see, e.g., Zon,
1988, Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, hybridization triggered cross-linking agent, transport agent,
hybridization-triggered cleavage agent, etc.

Isolated Proteins

The invention provides isolated polypeptides encoded by glucose transport-related
nucleic acids depicted in Figs. 1, 2A-2R, and 3A-3E. These polypeptides can be used, e.g.,
as immunogens to raise antibodies. Methods are well known in the art for predicting the
translation products of the nucleic acids (i.e., using computer programs that provide the
predicted polypeptide sequences and direction as to which of the three reading frames is the
open reading frame of the sequence). These polypeptide sequences can then be produced
either biologically (e.g., by positioning the nucleic acid sequence that encodes them in-frame
in an expression vector transfected into a compatible expression system) or chemically using
methods known in the art. The polypeptides encoded by the genes listed in Figs. 6-9 and
13-15 are also useful in the invention. For example, the entire polypeptide or a fragment thereof
can be used to produce an antibody that is useful in a screening assay. Figs. 6-9 and 13-15
provide the GenBank accession numbers of the sequences, when available. These listings
provide both nucleotide and polypeptide sequences that are useful in the invention.
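
The reading frame prediction mentioned above can be sketched in Python as follows. This is a minimal illustration, not any particular program known in the art: it assumes the standard genetic code, considers only the three forward frames, and defines an open reading frame as the longest stretch that begins with methionine and runs to the next stop codon or the end of the sequence. The function names and the example sequence are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" is a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate dna (A, C, G, T only) starting at offset frame (0, 1, or 2)."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


def predicted_orf(dna):
    """Return (frame, peptide) for the longest Met-initiated stretch found in
    the three forward frames, or (None, "") if no frame contains one."""
    dna = dna.upper()
    best_frame, best_peptide = None, ""
    for frame in range(3):
        protein = translate(dna, frame)
        # Each match starts at a methionine and stops before the next "*".
        for match in re.finditer(r"M[^*]*", protein):
            if len(match.group()) > len(best_peptide):
                best_frame, best_peptide = frame, match.group()
    return best_frame, best_peptide


# Hypothetical example: only the third frame contains a start codon.
frame, peptide = predicted_orf("CCATGGCTGCTAAATAAGG")
```

For a partial cDNA that lacks its start codon, the methionine requirement would need to be relaxed, e.g., by taking the longest stop-free stretch in each frame instead.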

An "isolated" or "purified" protein or biologically active portion thereof is
substantially free of cellular material or other contaminating proteins from the cell or tissue
source from which the protein is derived, or substantially free of chemical precursors or other
chemicals when chemically synthesized. The language "substantially free of cellular
material" includes preparations of protein in which the protein is separated from cellular
components of the cells from which it is isolated or recombinantly produced. Thus, protein
that is substantially free of cellular material includes preparations of protein having less than
about 30%, 20%, 10%, or 5% (by dry weight) of heterologous protein (also referred to herein
as "contaminating protein"). In general, when the protein or biologically active portion
thereof is recombinantly produced, it is also substantially free of culture medium, i.e., culture
medium represents less than about 20%, 10%, or 5% of the volume of the protein
preparation. In general, when the protein is produced by chemical synthesis, it is
substantially free of chemical precursors or other chemicals, i.e., it is separated from
chemical precursors or other chemicals that are involved in the synthesis of the protein.
Accordingly, such preparations of the protein have less than about 30%, 20%, 10%, or 5% (by
dry weight) of chemical precursors or compounds other than the polypeptide of interest.

Expression of proteins and polypeptides can be assayed to determine the amount of
expression. Methods for assaying protein expression are known in the art and include
Western blot, immunoprecipitation, and radioimmunoassay.

Biologically active portions of a polypeptide of the invention include polypeptides
comprising amino acid sequences sufficiently identical to or derived from the amino acid
sequence of the protein, which include fewer amino acids than the full length protein, and
exhibit at least one activity of the corresponding full-length protein. Typically, biologically
active portions comprise a domain or motif with at least one activity of the corresponding
protein. A biologically active portion of a protein of the invention can be a polypeptide
which is, for example, 10, 25, 50, 100, or more amino acids in length. Moreover, other
biologically active portions, in which other regions of the protein are deleted, can be prepared
by recombinant techniques and evaluated for one or more of the functional activities of the
native form of a polypeptide of the invention.

Polypeptides of the invention have the predicted amino acid sequence of an open
reading frame of c0148 (SEQ ID NO:1), c0827 (SEQ ID NO:2), and c1083 (SEQ ID NO:3).
In some embodiments, polypeptides of the invention have the predicted amino acid sequence
selected from SEQ ID NOS:4-93. Other useful proteins are substantially identical (e.g., at
least about 45%, preferably 55%, 65%, 75%, 85%, 95%, or 99%) to the predicted amino acid
sequence of a polypeptide encoded by a polynucleotide comprising the polynucleotide
sequence of c0148 (SEQ ID NO:1), c0827 (SEQ ID NO:2), and c1083 (SEQ ID NO:3) or
substantially identical (e.g., at least about 93%, preferably 94%, 95%, 96%, or 99%) to the
predicted amino acid sequence of a polypeptide encoded by a polynucleotide comprising the
polynucleotide sequence of c0148 (SEQ ID NO:1), c0827 (SEQ ID NO:2), and c1083 (SEQ
ID NO:3), and retain the functional activity of the corresponding naturally
occurring protein yet differ in amino acid sequence due to natural allelic variation or
mutagenesis.

The comparison of sequences and determination of percent identity between two
sequences can be accomplished using a mathematical algorithm. In an embodiment of the
invention, the percent identity between two amino acid sequences is determined using the
Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been
incorporated into the GAP program in the GCG software package (available at
http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap
weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In another
embodiment, the percent identity between two nucleotide sequences is determined using the
GAP program in the GCG software package (available at http://www.gcg.com), using a
NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1,
2, 3, 4, 5, or 6. In general, percent identity between amino acid sequences referred to herein
is determined using the BLAST 2.0 program, which is available to the public at
http://www.ncbi.nlm.nih.gov/BLAST. Sequence comparison is performed using an
ungapped alignment and using the default parameters (BLOSUM 62 matrix, gap existence cost
of 11, per residue gap cost of 1, and a lambda ratio of 0.85). The mathematical algorithm
used in BLAST programs is described in Altschul et al., 1997, Nucleic Acids Research
25:3389-3402.
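
The global alignment approach of Needleman and Wunsch can be sketched in a few lines of Python. This is a simplified illustration and not the GAP or BLAST program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix, a single linear gap penalty in place of separate gap and length weights, and it defines percent identity as identical aligned positions divided by the alignment length. All function names and parameter values are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair_score = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Percent of alignment columns in which both sequences carry the same residue."""
    top, bottom = needleman_wunsch(a, b)
    identical = sum(1 for x, y in zip(top, bottom) if x == y)
    return 100.0 * identical / len(top)
```

Other definitions of percent identity divide by the length of the shorter sequence or by the number of ungapped columns, so values computed this way are not directly comparable to those reported by GAP or BLAST.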

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises all or part (e.g., a biologically active portion) of a
polypeptide of the invention operably linked to a heterologous polypeptide (i.e., a
polypeptide other than the same polypeptide of the invention). Within the fusion protein, the
term "operably linked" is intended to indicate that the polypeptide of the invention and the
heterologous polypeptide are fused in-frame to each other. The heterologous polypeptide can
be fused to the N-terminus or C-terminus of the polypeptide of the invention.

One useful fusion protein is a GST fusion protein in which the polypeptide of the
invention is fused to the C-terminus of GST sequences. Such fusion proteins can facilitate
the purification of a recombinant polypeptide of the invention.

In another embodiment, the fusion protein contains a heterologous signal sequence at
its N-terminus. For example, the native signal sequence of a polypeptide of the invention can
be removed and replaced with a signal sequence from another protein. For example, the
gp67 secretory sequence of the baculovirus envelope protein can be used as a heterologous
signal sequence (Current Protocols in Molecular Biology, Ausubel et al., eds., John Wiley &
Sons, 1992). Other examples of eukaryotic heterologous signal sequences include the
secretory sequences of melittin and human placental alkaline phosphatase (Stratagene; La
Jolla, California). In yet another example, useful prokaryotic heterologous signal sequences
include the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal
(Pharmacia Biotech; Piscataway, New Jersey).

In yet another embodiment, the fusion protein is an immunoglobulin fusion protein in
which all or part of a polypeptide of the invention is fused to sequences derived from a
member of the immunoglobulin protein family. The immunoglobulin fusion proteins of the
invention can be incorporated into pharmaceutical compositions and administered to a
subject to inhibit an interaction between a ligand (soluble or membrane-bound) and a protein
on the surface of a cell (receptor), to thereby suppress signal transduction in vivo. The
immunoglobulin fusion protein can be used to affect the bioavailability of a cognate ligand of
a polypeptide of the invention. Inhibition of ligand/receptor interaction may be useful
therapeutically, both for treating proliferative and differentiative disorders and for
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the immunoglobulin
fusion proteins of the invention can be used as immunogens to produce antibodies directed
against a polypeptide of the invention in a subject, to purify ligands and in screening assays
to identify molecules which inhibit the interaction of receptors with ligands.

Chimeric and fusion proteins of the invention can be produced by standard 
recombinant DNA techniques. In another embodiment, the fusion gene can be synthesized 
by conventional techniques including automated DNA synthesizers. Alternatively, PCR 
amplification of gene fragments can be carried out using anchor primers which give rise to
complementary overhangs between two consecutive gene fragments which can subsequently
be annealed and reamplified to generate a chimeric gene sequence (see, e.g., Ausubel et al.,
supra). Moreover, many expression vectors are commercially available that already encode a
fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the polypeptide of the invention.

A signal sequence of a polypeptide of the invention can be used to facilitate secretion
and isolation of the secreted protein or other proteins of interest. Signal sequences are
typically characterized by a core of hydrophobic amino acids which are generally cleaved
from the mature protein during secretion in one or more cleavage events. Such signal
peptides contain processing sites that allow cleavage of the signal sequence from the mature
proteins as they pass through the secretory pathway. Thus, the invention pertains to the
described polypeptides having a signal sequence, as well as to the signal sequence itself and
to the polypeptide in the absence of the signal sequence (i.e., the cleavage products). In one
embodiment, a nucleic acid sequence encoding a signal sequence of the invention can be
operably linked in an expression vector to a protein of interest, such as a protein which is
ordinarily not secreted or is otherwise difficult to isolate. The signal sequence directs
secretion of the protein, such as from a eukaryotic host into which the expression vector is
transformed, and the signal sequence is subsequently or concurrently cleaved. The protein
can then be readily purified from the extracellular medium by methods known in the art.
Alternatively, the signal sequence can be linked to the protein of interest using a sequence
which facilitates purification, such as with a GST domain.

The present invention also pertains to variants of the polypeptides of the invention.
Such variants have an altered amino acid sequence which can function as either agonists
(mimetics) or as antagonists. Variants can be generated by mutagenesis, e.g., discrete point
mutation or truncation. An agonist can retain substantially the same, or a subset, of the
biological activities of the naturally occurring form of the protein. An antagonist of a protein
can inhibit one or more of the activities of the naturally occurring form of the protein by, for
example, competitively binding to a downstream or upstream member of a cellular signaling
cascade which includes the protein of interest. Thus, specific biological effects can be
elicited by treatment with a variant of limited function. Treatment of a subject with a variant
having a subset of the biological activities of the naturally occurring form of the protein can
have fewer side effects in a subject relative to treatment with the naturally occurring form of
the protein.

Antibodies 

An isolated polypeptide of the invention, or a fragment thereof, can be used as an
immunogen to generate antibodies using standard techniques for polyclonal and monoclonal
antibody preparation. The full-length polypeptide or protein can be used or, alternatively, the
invention provides antigenic peptide fragments for use as immunogens. The antigenic
peptide of a protein of the invention comprises at least 8 (e.g., 10, 15, 20, or 30) amino acid
residues of the amino acid sequence of a sequence of the invention, e.g., c0148, c0827, and
c1083, and encompasses an epitope of the protein such that an antibody raised against the
peptide forms a specific immune complex with the protein. Sequences also useful in the
invention include polypeptides encoded by the sequences in Figs. 1, 2A-2R, and 3A-3E or
polypeptides encoded by sequences comprising a sequence listed in Figs. 1, 2A-2R, and
3A-3R. Polypeptides encoded by the known genes identified herein as glucose transport-related
genes are also useful in the invention.

Epitopes that can be encompassed by the antigenic peptide are regions that are located on
the surface of the protein, e.g., hydrophilic regions. Hydrophilic regions of selected
sequences are indicated in hydrophobicity plots (Figs. 10A-10D, 11A-11D, and 12A-12D).
These plots or similar analyses can be used to identify hydrophilic regions in polypeptides
useful in the invention.
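
As an illustration of the kind of "similar analyses" mentioned above, the following is a minimal sketch of a sliding-window hydropathy profile. The use of the Kyte-Doolittle scale, the window length of 9, the threshold of 0.0, and the function names are assumptions made for this example; the source does not state how the plots in the figures were computed.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every window of the given length."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(seq, window=9, threshold=0.0):
    """Return (start, end) index pairs of windows whose mean is below threshold."""
    profile = hydropathy_profile(seq, window)
    return [(i, i + window) for i, m in enumerate(profile) if m < threshold]
```

Windows with a low mean score are candidate surface-exposed stretches; whether any such stretch is actually a useful epitope would still have to be tested experimentally.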

An immunogen typically is used to prepare antibodies by immunizing a suitable
subject (e.g., a rabbit, goat, mouse, or other mammal). An appropriate immunogenic
preparation can contain, for example, a recombinantly expressed or a chemically synthesized
polypeptide. The preparation can further include an adjuvant, such as Freund's complete or
incomplete adjuvant, or similar immunostimulatory agent.

Polyclonal antibodies can be prepared as described above by immunizing a suitable
subject with a polypeptide of the invention as an immunogen. The antibody titer in the
immunized subject can be monitored over time by standard techniques, such as with an
enzyme linked immunosorbent assay (ELISA) using immobilized polypeptide. If desired,
the antibody molecules can be isolated from the mammal (e.g., from the blood) and further
purified by well-known techniques, such as protein A chromatography to obtain the IgG
fraction. At an appropriate time after immunization, e.g., when the specific antibody titers
are highest, antibody-producing cells can be obtained from the subject and used to prepare
monoclonal antibodies by standard techniques, such as the hybridoma technique originally
described by Kohler and Milstein, 1975, Nature 256:495-497, the human B cell hybridoma
technique (Kozbor et al., 1983, Immunol. Today 4:72), the EBV-hybridoma technique (Cole
et al., 1985, Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96) or
trioma techniques. The technology for producing hybridomas is well known (see generally
Current Protocols in Immunology, 1994, Coligan et al. (eds.) John Wiley & Sons, Inc., New
York, NY). Hybridoma cells producing a monoclonal antibody of the invention are detected
by screening the hybridoma culture supernatants for antibodies that bind the polypeptide of
interest, e.g., using a standard ELISA assay.

As an alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal
antibody directed against a polypeptide of the invention can be identified and isolated by
screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage
display library) with the polypeptide of interest. Kits for generating and screening phage
display libraries are commercially available (e.g., the Pharmacia Recombinant Phage
Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit,
Catalog No. 240612). Additionally, examples of methods and reagents particularly amenable
for use in generating and screening an antibody display library can be found in, for example,
U.S. Patent No. 5,223,409; PCT Publication No. WO 92/18619; PCT Publication No. WO
91/17271; PCT Publication No. WO 92/20791; PCT Publication No. WO 92/15679; PCT
Publication No. WO 93/01288; PCT Publication No. WO 92/01047; PCT Publication No.
WO 92/09690; PCT Publication No. WO 90/02809; Fuchs et al., 1991, Bio/Technology
9:1370-1372; Hay et al., 1992, Hum. Antibod. Hybridomas 3:81-85; Huse et al., 1989,
Science 246:1275-1281; and Griffiths et al., 1993, EMBO J. 12:725-734.

Additionally, recombinant antibodies, such as chimeric and humanized monoclonal
antibodies, comprising both human and non-human portions, which can be made using
standard recombinant DNA techniques, are within the scope of the invention. Such chimeric
and humanized monoclonal antibodies can be produced by recombinant DNA techniques
known in the art, for example, using methods described in PCT Publication No. WO
87/02671; European Patent Application 184,187; European Patent Application 171,496;
European Patent Application 173,494; PCT Publication No. WO 86/01533; U.S. Patent No.
4,816,567; European Patent Application 125,023; Better et al., 1988, Science 240:1041-1043;
Liu et al., 1987, Proc. Natl. Acad. Sci. USA 84:3439-3443; Liu et al., 1987, J.
Immunol. 139:3521-3526; Sun et al., 1987, Proc. Natl. Acad. Sci. USA 84:214-218;
Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al., 1985, Nature 314:446-449;
Shaw et al., 1988, J. Natl. Cancer Inst. 80:1553-1559; Morrison, 1985, Science 229:1202-1207;
Oi et al., 1986, Bio/Techniques 4:214; U.S. Patent 5,225,539; Jones et al., 1986,
Nature 321:552-525; Verhoeyan et al., 1988, Science 239:1534; and Beidler et al., 1988, J.
Immunol. 141:4053-4060.

Completely human antibodies are particularly desirable for therapeutic treatment of
human patients. Such antibodies can be produced using transgenic mice which are incapable
of expressing endogenous immunoglobulin heavy and light chain genes, but which can
express human heavy and light chain genes. The transgenic mice are immunized in the
normal fashion with a selected antigen, e.g., all or a portion of a polypeptide of the invention.
Monoclonal antibodies directed against the antigen can be obtained using conventional
hybridoma technology. The human immunoglobulin transgenes harbored by the transgenic
mice rearrange during B cell differentiation, and subsequently undergo class switching and
somatic mutation. Thus, using such a technique, it is possible to produce therapeutically
useful IgG, IgA, and IgE antibodies. For an overview of this technology for producing
human antibodies, see Lonberg and Huszar (1995, Int. Rev. Immunol. 13:65-93). For a
detailed discussion of this technology for producing human antibodies and human
monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Patent
5,625,126; U.S. Patent 5,633,425; U.S. Patent 5,569,825; U.S. Patent 5,661,016; and U.S.
Patent 5,545,806. In addition, companies such as Abgenix, Inc. (Fremont, CA), can be
engaged to provide human antibodies directed against a selected antigen using technology
similar to that described above.

Completely human antibodies which recognize a selected epitope can be generated
using a technique referred to as "guided selection." In this approach a selected non-human
monoclonal antibody, e.g., a murine antibody, is used to guide the selection of a completely
human antibody recognizing the same epitope (Jespers et al., 1994, Bio/Technology
12:899-903).

An antibody directed against a polypeptide of the invention (e.g., monoclonal
antibody) can be used to isolate the polypeptide by standard techniques, such as affinity
chromatography or immunoprecipitation. Moreover, such an antibody can be used to detect
the protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance
and pattern of expression of the polypeptide. The antibodies can also be used diagnostically
to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to
determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling
the antibody to a detectable substance. Examples of detectable substances include various
enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent
materials, and radioactive materials. Examples of suitable enzymes include horseradish
peroxidase, alkaline phosphatase, beta-galactosidase, or acetylcholinesterase; examples of
suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples
of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein
isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or
phycoerythrin; an example of a luminescent material includes luminol; examples of
bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable
radioactive material include 125I, 131I, 35S, or 3H.

Screening Assays 

The invention provides a method for identifying modulators, i.e., candidate agents or 
reagents, of expression or activity of a glucose transport-related nucleic acid or polypeptide. 
Such candidate agents or reagents include polypeptides, oligonucleotides, peptidomimetics, 
carbohydrates or small molecules such as small organic or inorganic molecules (e.g., non- 
nucleic acid small organic chemical compounds) that modulate expression (protein or 
mRNA) or activity of one or more glucose transport-related polypeptides or nucleic acids. In 
general, screening assays involve assaying the effect of a test agent on expression or activity
of a glucose transport-related nucleic acid or polypeptide in a test sample (i.e., a sample 
containing the glucose transport-related nucleic acid or polypeptide). Expression or activity 
in the presence of the test compound or agent is compared to expression or activity in a 
control sample (i.e., a sample containing a glucose transport-related polypeptide that was not 
incubated in the presence of the test compound). A change in the expression or activity of 
the glucose transport-related nucleic acid or polypeptide in the test sample compared to the 
control indicates that the test agent or compound modulates expression or activity of the 
glucose transport-related nucleic acid or polypeptide and is a candidate agent.

In one embodiment, the invention provides assays for screening candidate agents that
bind to or modulate the activity of a polypeptide or nucleic acid of the invention or
biologically active portion thereof. The compounds to be screened can be obtained using
any of the numerous approaches in combinatorial library methods known in the art,
including: biological libraries; spatially addressable parallel solid phase or solution phase
libraries; synthetic library methods requiring deconvolution; the "one-bead one-compound"
library method; and synthetic library methods using affinity chromatography selection. The
biological library approach is limited to peptide libraries, while the other four approaches are
applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam,
1997, Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art,
for example in: DeWitt et al., 1993, Proc. Natl. Acad. Sci. USA 90:6909; Erb et al., 1994,
Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al., 1994, J. Med. Chem. 37:2678; Cho
et al., 1993, Science 261:1303; Carrell et al., 1994, Angew. Chem. Int. Ed. Engl. 33:2059;
Carell et al., 1994, Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al., 1994, J. Med.
Chem. 37:1233.

Libraries of compounds may be presented in solution (e.g., Houghten, 1992,
Bio/Techniques 13:412-421), or on beads (Lam, 1991, Nature 354:82-84), chips (Fodor,
1993, Nature 364:555-556), bacteria (U.S. Patent No. 5,223,409), spores (U.S. Patent
Nos. 5,571,698; 5,403,484; and 5,223,409), plasmids (Cull et al., 1992, Proc. Natl. Acad. Sci.
USA 89:1865-1869) or phage (Scott and Smith, 1990, Science 249:386-390; Devlin, 1990,
Science 249:404-406; Cwirla et al., 1990, Proc. Natl. Acad. Sci. USA 87:6378-6382; and
Felici, 1991, J. Mol. Biol. 222:301-310).

In one embodiment, the assay is a cell-based assay in which a cell expressing a
polypeptide of the invention, or a biologically active portion thereof, on the cell surface is
contacted with a test compound. The ability of the test compound to bind to the polypeptide
is then determined. The cell, for example, can be a yeast cell or a cell of mammalian origin.
Determining the ability of the test compound to bind to the polypeptide can be accomplished,
for example, by coupling the test compound with a radioisotope or enzymatic label such that
binding of the test compound to the polypeptide or biologically active portion thereof can be
determined by detecting the labeled compound in a complex. For example, test compounds
can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope
detected by direct counting of radioemission or by scintillation counting. Alternatively,
test compounds can be enzymatically labeled with, for example, horseradish peroxidase,
alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of
conversion of an appropriate substrate to product. In one embodiment, the assay comprises
contacting a cell which expresses a membrane-bound form of a polypeptide of the invention,
or a biologically active portion thereof, on the cell surface with a known compound which
binds to the polypeptide to form an assay mixture, contacting the assay mixture with a test
compound, and determining the ability of the test compound to interact with the polypeptide,
wherein determining the ability of the test compound to interact with the polypeptide
comprises determining the ability of the test compound to preferentially bind to the
polypeptide or a biologically active portion thereof as compared to the known compound.
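
One way to put a number on the competition format described above is to express the signal from a labeled known compound in the presence of a test compound as a percentage of the signal lost. The sketch below is an illustration only: the function name, the background-subtraction step, and the example counts are assumptions and are not taken from the source.

```python
def percent_displacement(total_signal, signal_with_test, background=0.0):
    """Percent of specifically bound labeled known compound displaced by a test compound.

    total_signal:      signal from the labeled known compound alone
    signal_with_test:  signal when the test compound is also present
    background:        nonspecific signal (hypothetical control well)
    """
    specific_total = total_signal - background
    if specific_total <= 0:
        raise ValueError("total signal must exceed background")
    specific_with_test = signal_with_test - background
    return 100.0 * (1.0 - specific_with_test / specific_total)


# Hypothetical counts: 1000 cpm alone, 400 cpm with test compound, 100 cpm background.
displacement = percent_displacement(1000, 400, 100)
```

A high percentage suggests that the test compound competes effectively with the known compound; the cutoff for calling a compound a hit is a design choice that the source does not specify.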

In another embodiment, an assay is a cell-based assay comprising contacting a cell 
expressing a membrane-bound form of a polypeptide of the invention, or a biologically active 
portion thereof, on the cell surface with a test compound and determining the ability of the 
test compound to modulate (e.g., stimulate or inhibit) the activity of the polypeptide or 
biologically active portion thereof. Determining the ability of the test compound to modulate 
the activity of the polypeptide or a biologically active portion thereof can be accomplished, 
for example, by determining the ability of the polypeptide to bind to or interact with a target 
molecule. 

Determining the ability of a polypeptide or nucleic acid of the invention to bind to or 
interact with a target molecule can be accomplished by one of the methods described herein 
for determining direct binding. As used herein, a "target molecule" is a molecule with which 
a selected polypeptide or nucleic acid (e.g., a polypeptide or nucleic acid of the invention) 
binds or interacts with in nature, for example, a molecule on the surface of a cell which 
expresses the selected protein, a molecule on the surface of a second cell, a molecule in the 
extracellular milieu, a molecule associated with the internal surface of a cell membrane or a 
cytoplasmic molecule. A target molecule can be a polypeptide or nucleic acid of the 
invention or some other polypeptide, protein or nucleic acid. For example, a target molecule 
can be a component of a signal transduction pathway which facilitates transduction of an 
extracellular signal (e.g., a signal generated by binding of a compound to a polypeptide of the 
invention) through the cell membrane and into the cell or a second intercellular protein which 
has catalytic activity or a protein which facilitates the association of downstream signaling 
molecules with a polypeptide of the invention. Determining the ability of a polypeptide of
the invention to bind to or interact with a target molecule can also be accomplished by
determining the activity of the target molecule. For example, the activity of the target
molecule can be determined by detecting induction of a cellular second messenger of the
target (e.g., intracellular Ca2+, diacylglycerol, or IP3), detecting catalytic/enzymatic activity
of the target on an appropriate substrate, detecting the induction of a reporter gene (e.g., a
regulatory element that is responsive to a polypeptide of the invention operably linked to a
nucleic acid encoding a detectable marker, e.g., luciferase), or detecting a cellular response,
for example, cellular differentiation or cell proliferation. When the target molecule is a
nucleic acid, the compound can be, e.g., a ribozyme or antisense molecule.

In yet another embodiment, an assay of the present invention is a cell-free assay
comprising contacting a polypeptide or nucleic acid of the invention, or biologically active
portion thereof, with a test compound and determining the ability of the test compound to
bind to the polypeptide or biologically active portion thereof. Binding of the test compound
to the polypeptide can be determined either directly or indirectly as described above. In one
embodiment, the assay includes contacting the polypeptide of the invention or biologically
active portion thereof with a known compound which binds the polypeptide to form an assay
mixture, contacting the assay mixture with a test compound, and determining the ability of
the test compound to interact with the polypeptide (e.g., its ability to compete with binding of
the known compound), wherein determining the ability of the test compound to interact with
the polypeptide comprises determining the ability of the test compound to preferentially bind
to the polypeptide or biologically active portion thereof as compared to the known
compound. When the test compound is targeted to a nucleic acid, the binding of the test
compound to the nucleic acid can be tested, e.g., by binding, by fragmentation of the nucleic
acid (as when the test compound is a ribozyme), or by inhibition of transcription or
translation in the presence of the test compound.

In another embodiment, an assay is a cell-free assay comprising contacting a
polypeptide of the invention or biologically active portion thereof with a test compound and
determining the ability of the test compound to modulate (e.g., stimulate or inhibit) the
activity of the polypeptide or biologically active portion thereof. For example, determining
the ability of the test compound to modulate the activity of the polypeptide can be
accomplished by determining the ability of the polypeptide of the invention to modify the
target molecule. Such methods can, alternatively, measure the catalytic/enzymatic activity of
the target molecule on an appropriate substrate. In general, modulation of the activity of the
polypeptide of the invention or biologically active portion thereof is determined by comparing the
activity in the absence of the test compound to the activity in the presence of the test
compound.

In yet another embodiment, the cell-free assay comprises contacting a polypeptide or 
nucleic acid of the invention, or biologically active portion thereof, with a known compound 
which binds to the polypeptide to form an assay mixture, contacting the assay mixture with a
test compound, and determining the ability of the test compound to interact with the 
polypeptide or nucleic acid, wherein determining the ability of the test compound to interact 
with the polypeptide or nucleic acid comprises determining the ability of the polypeptide or 
nucleic acid to preferentially bind to or modulate the activity of a target molecule. 

The cell-free assays of the present invention are amenable to use of either a soluble
form or the membrane-bound form of a polypeptide of the invention. In the case of cell-free
assays comprising the membrane-bound form of the polypeptide, it may be desirable to
utilize a solubilizing agent such that the membrane-bound form of the polypeptide is
maintained in solution. Examples of such solubilizing agents include non-ionic detergents
such as n-octylglucoside, n-dodecylglucoside, n-octylmaltoside, octanoyl-N-methylglucamide,
decanoyl-N-methylglucamide, Triton X-100, Triton X-114, Thesit,
Isotridecypoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylamminio]-1-propane
sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylamminio]-2-hydroxy-1-propane
sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

In more than one embodiment of the above assay methods of the present invention, it
may be desirable to immobilize either the polypeptide of the invention or its target molecule
to facilitate separation of complexed from uncomplexed forms of one or both of the proteins,
as well as to accommodate automation of the assay. Binding of a test compound to the
polypeptide, or interaction of the polypeptide with a target molecule in the presence and
absence of a test agent, can be accomplished in any vessel suitable for containing the
reactants. Examples of such vessels include microtitre plates, test tubes, and
micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain
that allows one or both of the proteins to be bound to a matrix. For example,
glutathione-S-transferase fusion proteins can be adsorbed onto
glutathione sepharose beads (Sigma Chemical; St. Louis, MO) or glutathione derivatized
microtitre plates, which are then combined with the test compound or the test compound and
either the non-adsorbed target protein or a polypeptide of the invention, and the mixture
incubated under conditions conducive to complex formation (e.g., at physiological conditions
for salt and pH). Following incubation, the beads or microtitre plate wells are washed to
remove any unbound components and complex formation is measured either directly or
indirectly, for example, as described above. Alternatively, the complexes can be dissociated
from the matrix, and the level of binding or activity of the polypeptide of the invention can
be determined using standard techniques.

Other techniques for immobilizing proteins on matrices can also be used in the 
screening assays of the invention. For example, either the polypeptide of the invention or its 
target molecule can be immobilized utilizing conjugation of biotin and streptavidin. 
Biotinylated polypeptide of the invention or target molecules can be prepared from
biotin-NHS (N-hydroxy-succinimide) using techniques well known in the art (e.g., biotinylation kit,
Pierce Chemical; Rockford, IL), and immobilized in the wells of streptavidin-coated 96-well
plates (Pierce Chemical). Alternatively, antibodies reactive with the polypeptide of the
invention or target molecules but which do not interfere with binding of the polypeptide of
the invention to its target molecule can be derivatized to the wells of the plate, and unbound
target or polypeptide of the invention trapped in the wells by antibody conjugation. Methods
for detecting such complexes, such as GST-immobilized complexes, include
immunodetection of complexes using antibodies reactive with the polypeptide of the 
invention or target molecule, as well as enzyme-linked assays which rely on detecting an 
enzymatic activity associated with the polypeptide of the invention or target molecule. 

In another embodiment, modulators of expression of a polypeptide of the invention 
are identified in a method in which a cell is contacted with a test agent or compound and the 
expression of the selected mRNA or protein (i.e., the mRNA or protein corresponding to a 
polypeptide or nucleic acid of the invention) in the cell is determined. The level of 
expression of the selected mRNA or protein in the presence of the test agent is compared to 
the level of expression of the selected mRNA or protein in the absence of the test agent. The
test agent can then be identified as a modulator of expression of the polypeptide (i.e., a
candidate compound) of the invention based on this comparison. For example, when
expression of the selected mRNA or protein is greater (statistically significantly greater) in
the presence of the test agent than in its absence, the test agent is identified as a candidate
agent that is a stimulator of the selected mRNA or protein expression. Alternatively, when
expression of the selected mRNA or protein is less (statistically significantly less) in the
presence of the test agent than in its absence, the test agent is identified as a candidate agent
that is an inhibitor of the selected mRNA or protein expression. The level of the selected
mRNA or protein expression in the cells can be determined by methods described herein.
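
The comparison described above (expression with versus without the test agent, judged for statistical significance) can be sketched as follows. The source does not name a statistical test; an exact two-sided permutation test on the difference of means, a significance level of 0.05, and the function names are all assumptions chosen for this example because they need only the Python standard library.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for a difference in mean expression."""
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    total = sum(pooled)
    extreme = 0
    n_splits = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        s = sum(pooled[i] for i in idx)
        diff = abs(s / n_treated - (total - s) / n_control)
        n_splits += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return extreme / n_splits


def classify_agent(treated, control, alpha=0.05):
    """Label a test agent from expression levels measured with and without it."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant change"
    if mean(treated) > mean(control):
        return "candidate stimulator"
    return "candidate inhibitor"
```

Full enumeration is practical only for small replicate numbers; with larger groups a sampled permutation test or a parametric test would be a reasonable substitute.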

In yet another aspect of the invention, a polypeptide of the invention can be used as a
"bait protein" in a two-hybrid assay or three hybrid assay (see, e.g., U.S. Patent No.
5,283,317; Zervos et al., 1993, Cell 72:223-232; Madura et al., 1993, J. Biol. Chem.
268:12046-12054; Bartel et al., 1993, Bio/Techniques 14:920-924; Iwabuchi et al., 1993,
Oncogene 8:1693-1696; and PCT Publication No. WO 94/10300), to identify other proteins
that bind to or interact with the polypeptide of the invention and modulate activity of the
polypeptide of the invention. Such binding proteins are also likely to be involved in the
propagation of signals by the polypeptide of the invention as, for example, upstream or
downstream elements of a signaling pathway involving the polypeptide of the invention.

Electronic Data Storage and Processing 

The invention includes nucleic acid and polypeptide sequences that are provided in 
digital form that can be transmitted and read electronically (e.g., in a database). In some 
embodiments, the database can be queried for comparison with data provided (e.g., a nucleic
acid sequence or a pattern of expression). All sequence information or data provided for
comparison with the database can be transmitted to the database, e.g., by email, via the 
Internet, on diskette, or any other mode of electronic or non-electronic communication. 

The invention thus features an electronic method of determining whether a patient has
a glucose transport-related disorder by obtaining an electronic form of a nucleic acid
sequence from the patient; obtaining a database of nucleic acid molecules whose expression
is altered in a glucose transport-related disorder such as type II diabetes that includes nucleic
acid molecules of individuals with glucose transport-related disorders; and comparing the
patient nucleic acid sequence with the nucleic acid molecules in the database, wherein a
patient nucleic acid sequence that matches a nucleic acid molecule in the database indicates
the patient has or is at risk for a glucose transport-related disorder.
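
The comparing step of the electronic method above can be sketched in a few lines. This is a minimal illustration, not the claimed implementation: it assumes that "matches" means exact identity after removing whitespace and case differences, and the dictionary layout, entry identifiers, and sequences are hypothetical.

```python
def normalize(seq):
    """Strip whitespace and upper-case a nucleotide sequence."""
    return "".join(seq.split()).upper()


def find_matches(patient_seq, database):
    """Return the sorted IDs of database entries identical to the patient sequence.

    database: dict mapping an entry ID to a nucleotide sequence (hypothetical layout).
    """
    query = normalize(patient_seq)
    return sorted(
        entry_id
        for entry_id, seq in database.items()
        if normalize(seq) == query
    )


# Hypothetical toy database; real entries would hold disorder-associated sequences.
toy_database = {"entry_A": "ACGT ACGT", "entry_B": "TTTT"}
matches = find_matches("acgtacgt", toy_database)
```

A non-empty result corresponds to the "matches" condition in the method; a practical system would more likely use an alignment-based comparison that tolerates mismatches.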

The invention also includes a database that includes an electronic form (e.g., digital
form) of the nucleic acid molecules of the invention, and computer-readable instructions for
a processor to carry out the comparison method. The database can also be stored on a
machine- or computer-readable medium, and can be accessed, e.g., through a
communications network, such as the Internet.

As used herein, "sequence information" refers to any nucleotide and/or amino acid
sequence information, including but not limited to full-length nucleotide and/or amino acid
sequences and partial nucleotide and/or amino acid sequences. Moreover, information "related
to" the sequence information includes detecting the presence or absence of a sequence (e.g.,
detection of expression of a sequence, fragment, or polymorphism), determination of the
level of a sequence (e.g., detection of a level of expression, for example, a quantitative
detection), detection of a reactivity to a sequence (e.g., detection of protein expression and/or
levels, for example, using a sequence-specific antibody), detection of a pattern of expression
of two or more sequences, and the like. These sequences can be read by electronic apparatus
and can be stored on any suitable medium for storing, holding, or containing data or
information that can be read and accessed by an electronic apparatus. Such media can
include, but are not limited to: magnetic storage media, such as floppy disks, hard disk
storage medium, and magnetic tape; optical storage media such as compact disks; electronic
storage media such as RAM, ROM, EPROM, EEPROM and the like; general hard disks and
hybrids of these categories such as magnetic/optical storage media. The medium is adapted
or configured for having recorded thereon sequence information.

As used herein, the term "electronic apparatus" is intended to include any suitable
computing or processing apparatus or other device configured or adapted for storing data or
information. Examples of electronic apparatus suitable for use with the present invention
include stand-alone computing apparatus such as personal computers (PCs) and large
computer systems. These systems can be accessed by communications networks, including
local area networks (LAN), wide area networks (WAN), Internet, Intranet, and Extranet. For
example, the database can be made available on an Internet website.

As used herein, "stored" refers to a process for encoding information on the electronic
apparatus readable medium. Those skilled in the art can readily adopt any of the presently
known methods for recording information on known media to generate manufactures
comprising the sequence information.

A variety of software programs and formats can be used to store the sequence
information on the electronic apparatus readable medium. For example, the sequence
information can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect® and Microsoft® Word®, or represented in the
form of an ASCII file, stored in a database application, such as DB2®, Sybase®, Oracle®,
or the like, as well as in other forms. Any number of data processor structuring formats (e.g.,
text file or database) can be employed to obtain or create a medium having recorded thereon
the sequence information.
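
As an illustration of recording sequence information as an ASCII file, the following minimal sketch writes and reads records in FASTA format. FASTA is a plain-text convention assumed here for the example; the passage above does not name a specific file format. The function names and the file name are hypothetical.

```python
import os
import tempfile


def write_fasta(records, path, width=60):
    """Write {identifier: sequence} pairs to an ASCII file in FASTA format."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            # Wrap the sequence so that each line holds at most `width` bases.
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read a FASTA file back into a {identifier: sequence} dictionary."""
    chunks, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                chunks[name] = []
            elif name is not None:
                chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def roundtrip(records):
    """Store records in a temporary file and read them back (for checking)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sequences.fasta")  # hypothetical file name
        write_fasta(records, path)
        return read_fasta(path)
```

A database application could hold the same identifier and sequence pairs in a two-column table; the flat file is shown only because it is the simplest of the forms listed above.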

By providing sequence information in machine or computer-readable form, one can 
routinely access the sequence information for a variety of purposes. For example, one skilled 
in the art can use the sequence information in computer-readable form to compare a specific 
sequence with the sequence information stored within a database. Search means are used to 
identify fragments or regions of the sequences that match a particular sequence. 
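
The comparison described above can be sketched as follows. This is a simplified exact-match search over an in-memory dictionary, offered as an illustration under stated assumptions rather than as the search means actually used (the Examples rely on BLAST, which also reports inexact matches). The function names and the idea of holding the database in a dictionary are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_matches(query, database):
    """List (name, strand, 0-based start) for each exact occurrence of query.

    Both strands are searched; overlapping occurrences are all reported.
    """
    query = query.upper()
    probes = [("+", query)]
    rc = reverse_complement(query)
    if rc != query:  # a palindromic query would otherwise be reported twice
        probes.append(("-", rc))

    hits = []
    for strand, probe in probes:
        for name, seq in database.items():
            seq = seq.upper()
            start = seq.find(probe)
            while start != -1:
                hits.append((name, strand, start))
                start = seq.find(probe, start + 1)
    return hits
```

Exact matching is the simplest possible choice; a practical search would score approximate alignments instead, which is what alignment programs provide.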

The present invention therefore provides a medium for storing or holding a database 
or instructions for performing a method for determining whether an individual has a specific 
disease or disorder related to glucose transport or a pre-disposition for a specific disease or 
disorder related to glucose transport, wherein the method can include analyzing the 
individual's sequence information and based on the sequence information, determining 
whether the individual has a particular disorder or a predisposition for a particular disorder 
associated with a specific genetic sequence, and/or recommending a particular treatment for 
the disorder or pre-disorder condition. For example, the pattern of expression of glucose
transport-related sequences or proteins from an individual suspected of having a glucose
transport-related disorder (e.g., type II diabetes) can be analyzed, and, based on the analysis
(e.g., aberrant expression of one or more glucose transport-related genes), a diagnosis
and instructions for treatment can be provided.

The invention will be further described in the following examples, which do not limit
the scope of the invention described in the claims.

EXAMPLES

Three approaches were used to identify genes and proteins involved in glucose
transport. First, several subtractive cDNA libraries were constructed that consist of genes
selectively expressed in insulin-responsive tissues. Furthermore, it has been discovered that
at least two of these genes have a role in regulating GLUT4 translocation. As a second
approach, microarrays were screened with fluorescently labeled probes synthesized from
mRNA isolated from insulin-responsive tissues. In the third approach, a subcellular fraction
was prepared that was enriched for vesicles involved in glucose transport. Proteins from this
fraction were prepared and analyzed using microsequencing techniques. Additional analysis
comparing the predicted protein sequences obtained in the first two approaches with the
vesicle protein sequences provided a subset of sequences involved in glucose transport that
are useful for certain aspects of the invention.

Example 1: Subtractive Libraries

Two methods were used to construct subtractive libraries. 

In the first method, suppression subtractive hybridization was used (Diatchenko et al.,
1996, Proc. Natl. Acad. Sci. USA 93:6025-30). In this method, a first library was
constructed that consisted of sequences that are highly expressed in muscle, but not in
3T3-L1 fibroblasts (available from American Type Culture Collection; ATCC). The second
library consisted of sequences that are highly expressed in 3T3-L1 adipocytes, but not in
3T3-L1 fibroblasts. The general method for this procedure is diagrammed in Fig. 4.

Libraries were constructed by reverse transcription of total mRNA isolated from
plates of confluent 3T3-L1 fibroblasts and 3T3-L1 adipocytes 9 to 10 days after the start of
differentiation. The resulting cDNAs were then digested with the restriction enzyme Rsa I.
Digested adipocyte cDNA was divided into two pools, and each pool was ligated to a
different oligonucleotide adaptor. Adaptor 1 was:

5'-CTAATACGACTCACTATAGGGCTCGAGCCGCCGCCCGGCCAGGT-3' (SEQ ID NO:94)
CGCCCGTCCA-5' (SEQ ID NO:95)

Adaptor 2 was:

5'-CTAATACGACTCACTATAGGGCAGCGTGGTCGCGGCCGAGGT-3' (SEQ ID NO:96)
GCCGGCTCCA-5' (SEQ ID NO:97)

Each pool of adipocyte cDNA (tester DNA) was then hybridized with an excess of fibroblast
cDNA (driver DNA) for 9 hours at 68°C. The two hybridization mixtures were combined
and incubated overnight at 68°C. After hybridization, the 5' overhangs were filled in with
Taq DNA polymerase, and the products were amplified by PCR using primers that are
homologous to each of the adaptors. This subtraction procedure was also performed using
the mouse muscle cDNA as the tester, and 3T3-L1 fibroblast cDNA as the driver.

As a test to demonstrate that muscle and adipocyte specific transcripts are amplified
by this procedure, the final products of both subtractions were amplified using PCR primers
internal to GLUT4 and α-tubulin transcripts. The final product of muscle subtraction (SUB)
and the unsubtracted muscle cDNA (UNSUB) were used for PCR analysis with primers
internal to the coding regions of Glut4 (G4) and α-tubulin-1 (TUB). Glut4 and α-tubulin-1
primers were designed to amplify 485 bp and 408 bp fragments, respectively. PCR samples
were removed after 23, 28 and 33 PCR cycles and loaded onto a 1.5% TAE
(40 mM Tris-acetate, pH 8.0, 1 mM EDTA) agarose gel. The gel was stained with ethidium
bromide and visualized with UV light. As expected, GLUT4 cDNA (representing GLUT4
expression) was found in the subtracted muscle cDNA, but tubulin cDNA was present in
relatively small amounts because tubulin is expressed in both fibroblasts and muscle (and so
a substantial amount of the tubulin sequence was subtracted out). GLUT4 is expressed in
muscle but not in fibroblasts, and so, as expected, was present in relatively large amounts.
In the muscle-subtracted cDNA, the GLUT4 signal is stronger in earlier PCR cycles, while
the tubulin signal is suppressed. Similar results were obtained with PCR analysis with
3T3-L1 adipocyte-subtracted cDNA.

To construct the libraries, the final PCR products from the 3T3-L1 adipocyte
subtraction were digested with Rsa I and cloned into the Eco RV-restricted pBluescript SK+
vector (STRATAGENE®), creating a library of adipocyte subtractive clones. The library
contained approximately 2 x 10^3 clones. The cloned plasmid DNA sequences were analyzed
by dideoxy sequencing with either the M13 -20 or reverse primer on an ABI 377 automatic
sequencer. In an initial round of sequencing, 183 independent clones, representing
expression from 65 different genes, were sequenced. Sequences were analyzed in a search
against the non-redundant (NR) nucleotide database using the Blast program at
www.ncbi.nlm.nih.gov/blast/blast.cgi. The gapped BLAST program was used against the
non-redundant or the dbEST database. All BLAST searches were performed using the default
settings, which are: Expect=10; Filter for Low complexity: on; Filter for Human Repeats: off;
Mask for lookup table only: off; Matrix=Blosum62; Gap existence cost=11; Per residue gap
cost=1; Lambda ratio=0.85.

Genes previously shown to be preferentially expressed or not preferentially expressed
in adipocytes are those whose mRNA expression profiles have been published in
journal articles in the Medline database. A summary of these sequences is shown in
Figs. 6A-6E. Approximately 60% of the sequenced clones in this library were from genes
previously reported as overexpressed in 3T3-L1 adipocytes. Another 23% of the clones
consisted of known gene sequences whose expression pattern was not known in adipocytes,
while 13% of the sequenced clones had unknown (previously unreported) sequences. Four
percent of the cloned sequences are from genes of mitochondrial origin. The genes in the
subtractive library that have already been shown to be preferentially expressed in
3T3-L1 adipocytes are listed in Figs. 6A-6E. Genes, such as adipoQ and stearoyl-CoA
desaturase, that are found at the highest frequency in this subtractive library are also those
that were discovered in previous attempts to clone genes that are highly expressed in 3T3-L1
adipocytes upon differentiation (Ntambi et al., 1988, J. Biol. Chem. 263:17291-17300;
Bernlohr et al., 1984, Proc. Natl. Acad. Sci. USA 81:5468-5472; Hu et al., 1996, J. Biol.
Chem. 271:10697-10703; Min and Spiegelman, 1986, Nucleic Acids Res. 14:8879-8892).

Sequences that are expressed in the Adipocyte Subtractive library that are from genes 
with unknown function are listed in Figs. 3A-3E. 

Example 2: Construction of a Muscle-Adipocyte Library 

To identify genes encoding proteins that are involved in glucose transport, gene
expression in 3T3-L1 adipocytes and muscle was investigated. To accomplish this, another
library was constructed consisting of genes that fulfilled the following two criteria. First, the
genes had to be highly expressed in both 3T3-L1 adipocytes and mouse skeletal muscle;
second, the genes could not be highly expressed in 3T3-L1 fibroblasts. This library, the
Muscle-Adipocyte Union Library (MAU library), was constructed using a modification of
the suppression subtractive hybridization technique (Fig. 5). The method was like the
suppression subtractive hybridization technique described in Fig. 4 except that adaptor 1 was
ligated to Rsa I-digested 3T3-L1 adipocyte cDNA while adaptor 2 was ligated to
Rsa I-digested mouse muscle cDNA. Both cDNAs were then hybridized to an excess of 3T3-L1
fibroblast DNA. The two hybridization reactions were then mixed to create hybrid molecules
in which one strand originated from adipocytes and the second strand of the hybrid was from
muscle. Because only these hybrid molecules have different adaptors on each end, they can
be PCR amplified, unlike the rest of the cDNAs. These hybrid products were then amplified
using PCR. The final PCR products of the 3T3-L1 muscle-adipocyte union subtraction were
cloned into overhang vector pCR2.1 (INVITROGEN®) to produce a library of
approximately 10^4 clones. Plasmid DNAs were dideoxy sequenced with either the M13 -20
or reverse primer on an ABI 377 automatic sequencer. Sequences were searched against
the non-redundant (NR) nucleotide database using the Blast program at
www.ncbi.nlm.nih.gov/blast/blast.cgi. Genes previously shown to be overexpressed or not
overexpressed in adipocytes are those whose mRNA expression profiles have been
published in journal articles in the Medline database. Figs. 7A-7U show the summary of
sequences from this library. These clones represent as many as 265 different genes. About
40% of these sequences are expressed from genes that have previously been shown to be
preferentially expressed in muscle, adipocytes, or both tissues. Another 26% of the clones are
sequences from known genes whose expression profile is not known, and 17% of the clones
represent previously unidentified genes. A large percentage of sequences (12%) represent
genes of mitochondrial origin. Fig. 1 shows sequences from this library that are novel, and
Figs. 2A-2R show the sequences of selected clones from this library. Figs. 7A-7U show the
genes that encode the sequences identified in the MAU library including the GenBank
accession no., when one is known. Figs. 7A-7U also list the homologous human genes for
these sequences and the expression profile of each sequence with respect to its expression in
adipocytes and muscle.
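
The selection step in this modified procedure rests on which adaptors flank a duplex once the two hybridization reactions are mixed. The following minimal sketch encodes only the rule stated above, namely that a duplex is amplified when its two strands carry different adaptors. The names are hypothetical, and the assumption that driver (fibroblast) strands carry no adaptor follows from the adaptors being ligated only to the adipocyte and muscle cDNAs.

```python
# Adaptor carried by a strand, keyed by the tissue the strand came from.
# Adipocyte cDNA received adaptor 1 and muscle cDNA received adaptor 2;
# fibroblast driver strands are assumed to carry no adaptor.
ADAPTOR_BY_SOURCE = {"adipocyte": 1, "muscle": 2, "fibroblast": None}


def is_amplified(strand_a, strand_b):
    """True when the duplex has a different adaptor on each end."""
    a = ADAPTOR_BY_SOURCE[strand_a]
    b = ADAPTOR_BY_SOURCE[strand_b]
    return a is not None and b is not None and a != b


def amplifiable_pairs():
    """Enumerate every unordered strand pairing that would be amplified."""
    sources = sorted(ADAPTOR_BY_SOURCE)
    return [
        (first, second)
        for i, first in enumerate(sources)
        for second in sources[i:]
        if is_amplified(first, second)
    ]
```

Enumerating the pairings shows why the library is enriched for sequences shared by adipocytes and muscle: only the adipocyte/muscle hybrid satisfies the rule, while tester/driver hybrids and same-tissue duplexes do not.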

The sequences identified in this manner are useful, e.g., for detecting a glucose
transport-related disorder such as type II diabetes.

Example 3: mRNA Expression Profiles of Unknown Genes in the 3T3-L1 Adipocyte
Subtractive and the Muscle-Adipocyte Union Libraries

To determine the expression profile of cognate RNA from library clones that have not
been previously reported to be overexpressed (i.e., preferentially expressed) in
insulin-sensitive (e.g., adipocyte and muscle) tissues, expression of these sequences was
analyzed in undifferentiated 3T3-L1 cells and differentiated 3T3-L1 adipocytes. Northern
blot analysis was used in which 3T3-L1 and mouse multi-tissue Northern blots were probed.

Cloned inserts from the Adipocyte Subtractive Library clones were labeled with
32P-dCTP and used in an initial screen to probe Northern blots of total RNA from 3T3-L1
fibroblasts and adipocytes. For Northern blotting, 3T3-L1 and multiple tissue total RNA
(10 µg) were electrophoresed on 1.2% agarose/6.6% formaldehyde gels, then transferred to
Nytran membranes. Before transfer, gels were stained with ethidium bromide and visualized
with UV light in order to confirm equal loading of RNAs. Blots were probed with inserts
containing fragments of previously unidentified genes from both libraries. Probes were
labeled with 32P-dCTP and incubated with the membranes overnight at 42°C. Blots were
washed twice with 2x SSC/0.1% SDS at room temperature, twice in 0.2x SSC/0.1% SDS at
room temperature and twice in 0.2x SSC/0.1% SDS at 42°C. After washing, blots were
exposed to a phosphor screen for one to three days. Phosphor screens were scanned with the
Storm 860 Scanner from Molecular Dynamics. Full-length clones for many of these
unknown genes have been obtained either by purchasing IMAGE Consortium clones or by
screening muscle or adipocyte lambda libraries (such libraries can be made using methods
known in the art).

Seventy-eight clones from the Adipocyte Subtractive Library were characterized.
Sixty of the 78 cloned sequences (approximately 75%) were preferentially expressed upon
adipocyte differentiation (i.e., in 3T3-L1 adipocytes).

Thirty-two clones from the 3T3-L1 Muscle-Adipocyte Union library (MAU library)
were analyzed. Nineteen were preferentially expressed in 3T3-L1 adipocytes. This leads to
the conclusion that approximately 50% of the clones in the MAU library, whose expression
has not previously been reported, are preferentially expressed in 3T3-L1 adipocytes. This
indicates that approximately 80% of the clones in the 3T3-L1 Adipocyte Subtractive Library
and 70% of the clones in the Muscle-Adipocyte Union Library (MAU library) are highly
expressed in at least one insulin-sensitive tissue. (For the 3T3-L1 Adipocyte Subtractive
Library, 60% of sequences previously shown to be preferentially expressed + 1/2 of 40% =
80%; for the MAU library, 40% of sequences previously shown to be preferentially expressed +
1/2 of 60% of uncharacterized genes = 70%.) Genes that were found to be preferentially
expressed in 3T3-L1 adipocytes were used to probe mouse multi-tissue Northern blots.
Using Northern analysis, it was confirmed that 11 previously unidentified genes from the
MAU library (i.e., genes expressed in adipocytes and muscle) are expressed in at least two
different insulin-sensitive tissues (see Figs. 6A-6E and 7A-7U; "overexpressed" indicates
that the sequence was found to be preferentially expressed in insulin-sensitive cells in these
experiments).
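
The estimates given in parentheses above combine two figures for each library: the fraction of clones already reported as preferentially expressed, and the share of the remaining clones taken, on the basis of the Northern analysis, to be preferentially expressed (one half in both calculations). A short sketch of that arithmetic, with a hypothetical function name, is:

```python
def estimated_preferential_fraction(previously_reported, share_of_rest):
    """Previously reported fraction plus the assumed share of the remainder."""
    rest = 1.0 - previously_reported
    return previously_reported + share_of_rest * rest


# Adipocyte Subtractive Library: 60% + 1/2 of 40%
adipocyte_library = estimated_preferential_fraction(0.60, 0.5)

# MAU library: 40% + 1/2 of 60%
mau_library = estimated_preferential_fraction(0.40, 0.5)

# Observed proportions from the Northern screens that motivate the shares used
adipocyte_observed = 60 / 78  # clones preferentially expressed in adipocytes
mau_observed = 19 / 32        # MAU clones preferentially expressed in adipocytes
```

The one-half share is the rounded figure used in the text; substituting the observed proportions would give somewhat higher estimates.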

Using multi-tissue Northern blots, it was shown that six previously identified
genes are highly expressed in insulin-sensitive tissues. Furthermore, at least two of these
proteins have a role in regulating GLUT4. This was determined as follows. Three clones in
the Muscle-Adipocyte Union Library consist of the 3' end of PP2Cα1 (GenBank Accession
No. D28117; Kato et al., 1994, Gene 145:311-312). Northern blot analysis demonstrated that
at least three transcripts of PP2Cα are highly expressed in both 3T3-L1 adipocytes and in
mouse fat. We further examined mRNA expression of PP2Cα. For Northern blotting,
3T3-L1 and multiple tissue total RNA (10 µg) were separated by electrophoresis on
1.2% agarose/6.6% formaldehyde gels, then transferred onto Nytran membranes. Blots were
probed with library clone c0452, which contains the last 216 base pairs of the PP2Cα1 coding
sequence along with 288 base pairs of the 3' noncoding region. Probes were labeled with
32P-dCTP and incubated with the membranes overnight at 42°C. Blots were washed twice
with 2x SSC/0.1% SDS at room temperature, twice in 0.2x SSC/0.1% SDS at room
temperature and twice in 0.2x SSC/0.1% SDS at 42°C. After washing, 3T3-L1 blots were
exposed to film for one day, while multi-tissue Northern blots were exposed to a phosphor
screen for one to three days. Phosphor screens were scanned with the Storm 860 Scanner
from Molecular Dynamics.

To assess the role of PP2Cα1 in insulin-stimulated glucose transport, PP2Cα1 protein
was microinjected into 3T3-L1 adipocytes, and GLUT4 translocation was determined by
immunofluorescence. Microinjection of PP2Cα1 was found to potentiate the ability of a
submaximal 1 nM concentration of insulin to translocate GLUT4 to the plasma membrane to
levels close if not equal to that of a maximal 10 nM insulin stimulation. To examine the
effect of microinjected PP2Cα1 on GLUT4 translocation, 3T3-L1 adipocytes were incubated
in serum-free medium for two hours and microinjected with either IgG alone or PP2Cα along
with IgG. Sixty minutes later, adipocytes were incubated with media alone, 1 nM insulin or a
maximally effective concentration of insulin (10 nM) for 30 minutes. Cells were then fixed
with methanol and then stained with anti-GLUT4 antibody. Adipocytes were examined
using fluorescence microscopy (Zeiss Axioskop, at 630x magnification) and
scored for the presence of substantial cell surface GLUT4 immunoreactivity at the plasma
membrane. Controls are cells on the same coverslips that were not injected. Microinjection
of phosphatases 2A or 2B had no effect on the ability of insulin to activate GLUT4
translocation. Western blotting has also revealed that PP2Cα selectively
co-immunoprecipitates insulin receptors but not PDGF receptors in an insulin-enhanced manner.

Gα11 (Q209L) Induced 2-Deoxyglucose Uptake in Differentiated 3T3-L1 Adipocytes.

Gα11 sequence (GenBank Accession No. U37411; Davignon et al., 1996, Genomics
31:359-366) was identified in the 3T3-L1 Adipocyte Subtractive Library. This protein is a
member of the Gαq family, whose members are components of heterotrimeric G protein
complexes. Northern blot analysis confirmed that Gα11 expression is induced upon 3T3-L1
adipocyte differentiation, and that it is more abundant by far in fat than in any other tissue.
Differentiated 3T3-L1 adipocytes were seeded at 150,000 cells per well in 24 well plates and
then infected with either control or Gα11 (Q209L) adenoviruses. Thirty hours after
infection, plates were serum starved for two hours in Krebs-Ringer phosphate buffer with
BSA and pyruvate. Plates were then treated with or without wortmannin (a specific inhibitor
of PI3 kinase) for 15 minutes followed by stimulation with insulin or endothelin for
30 minutes. Cells were then assayed for 2-deoxyglucose uptake as described in Frost and
Lane (1985, J. Biol. Chem. 260:2646-2652).

For Northern blotting, 3T3-L1 and multiple tissue total RNA (10 µg) were separated
on 1.2% agarose/6.6% formaldehyde gels, then transferred onto Nytran membranes. Blots
were probed with library clone b0031, which contains nt 237 to nt 435 of the Gα11 coding
sequence. Probes were labeled with 32P-dCTP and incubated with the membranes overnight
at 42°C. Blots were washed twice with 2x SSC/0.1% SDS at room temperature, twice in 0.2x
SSC/0.1% SDS at room temperature and twice in 0.2x SSC/0.1% SDS at 42°C. After
washing, 3T3-L1 blots were exposed to film for one day, while multi-tissue Northern blots
were exposed to a phosphor screen for three days. Phosphor screens were scanned with the
Storm 860 Scanner from Molecular Dynamics. A closely related protein, Gq, did not have this
expression profile. Infection of 3T3-L1 adipocytes with a recombinant adenovirus
expressing a constitutively active form of Gα11, but not the native protein, led to
an increase in GLUT4 concentration in the plasma membrane, and a fourfold increase in
glucose uptake in a wortmannin-insensitive manner. Thus, wortmannin does not inhibit the
ability of the active form of Gα11 to stimulate GLUT4 translocation.

Since PI3 kinase activation is required for insulin to activate GLUT4 translocation,
these data indicate that Gα11 is likely a mediator of PI3 kinase-independent activators of
GLUT4 translocation, such as endothelin. In addition, these data demonstrate that glucose
transport-related genes were identified using the methods described herein. They also
illustrate an assay for identifying glucose transport-related sequences that are PI3
kinase-independent activators of GLUT4 translocation.

Example 4: Polypeptides Isolated from GLUT4-Enriched Vesicles 

The GLUT4 glucose transporter resides primarily in perinuclear membranes in
unstimulated 3T3-L1 adipocytes and is acutely translocated to the cell surface in response to
insulin. A novel method of purifying intracellular GLUT4-enriched membranes was used to
identify polypeptides involved in glucose transport.

Antibodies 

Rabbit polyclonal anti-GLUT4 antibody was raised against the C-terminal 12 amino
acid sequence of GLUT4. Mouse anti-transferrin receptor was from Zymed. Rabbit
polyclonal anti-VAMP2 antibody was from StressGen Biotechnologies Corp. Mouse
monoclonal anti-vimentin antibody used in immunoblots and immuno-electron microscopy
analysis was from Santa Cruz. Mouse monoclonal anti-α-tubulin antibody, used in
immunoblot and immuno-electron microscopy analysis, and the secondary antibodies
conjugated to gold particles for immuno-electron microscopy were from Amersham
Pharmacia Biotech.

Immunoblotting

Fractions from velocity gradients and equilibrium density gradients were prepared as
described above and aliquots from these fractions were subjected to SDS-PAGE on resolving
gels according to Laemmli (1970, Nature 227:680-685). Separated proteins were
electrophoretically transferred to nitrocellulose membrane, blocked with 3% nonfat milk and
1% BSA in TTBS (0.05% Tween 20 in Tris-buffered saline) and then incubated with primary
antibody in TTBS containing 1% BSA. After incubation, membranes were washed with
TTBS and incubated with horseradish peroxidase-labeled anti-mouse IgG for the detection of
monoclonal antibodies or with horseradish peroxidase-labeled anti-rabbit IgG for detection of
polyclonal antibodies. Proteins were visualized using an enhanced chemiluminescent
substrate kit (Amersham Pharmacia Biotech) and immunoblot intensities were quantified by
a scanning densitometer.

Electron Microscopy 

GLUT4-containing membranes of the insulin-sensitive fractions from the equilibrium
density gradient were isolated as described above. Fractions were pooled, pelleted by
centrifugation at 48,000 rpm for 2 hours, resuspended in PBS and fixed in a final
concentration of 2% paraformaldehyde in PBS. GLUT4-vesicles were then adsorbed to
Formvar-coated gold grids and processed for double labeling as outlined in Martin et al.
(supra) and Sleeman et al. (1998, J. Biol. Chem. 273:3132-3135). Grids were incubated with
50 µl of primary antibody diluted in 1% BSA and PBS as follows: anti-GLUT4, anti-IRAP,
anti-vimentin, anti-α-tubulin or non-immune IgG, as a negative control. After incubation
with each IgG fraction, grids were labeled with either 5 or 15 nm gold particles conjugated to
the secondary antibody (goat anti-rabbit or goat anti-mouse). Grids were stained with 1%
uranyl acetate, dried and viewed using a PHILIPS CM10 transmission electron microscope.

Purification of insulin-responsive GLUT4-containing membranes

GLUT4-containing membranes were prepared by first isolating low density (LD)
microsomes then subjecting these to further purification on sucrose velocity gradients.
Finally, the GLUT4 fractions from the sucrose gradients were subjected to equilibrium
density sucrose gradients. The preparations were made from primary, unstimulated or
insulin-stimulated rat adipocytes, although they could also be prepared from other tissues,
e.g., striated muscle.

To prepare the initial crude membrane preparations for purification, adipocytes were
isolated from epididymal fat pads of male Sprague-Dawley rats (125-150 g) by collagenase
digestion in Krebs-Ringer/HEPES, pH 7.4, supplemented with 2% bovine serum albumin and
2 mM pyruvate. Following digestion, the cells were washed and permitted to recover for
30 minutes. The cells were then incubated at 37°C with or without 100 nM insulin for
20 minutes. The cells were washed with PBS and immediately homogenized in buffer A
(50 mM HEPES, pH 7.4, 10 mM NaF, 1 mM NaPPi, 0.1 mM Na3VO4,
1 mM phenylmethylsulfonyl fluoride, 10 µg/ml aprotinin, and 10 µg/ml leupeptin), and then
subjected to differential centrifugation as described in Czech and Buxton, 1993, J. Biol.
Chem. 268:9187-9190. Low density microsomes were prepared by modifications of
previously described methods (McKeel, D.W. and Jarett, L., 1970, J. Cell Biol.
44:417-432). Briefly, cells were homogenized for 15 strokes with a motor-driven
Teflon/glass homogenizer in 24 ml of buffer containing 10 mM Tris-Cl, pH 7.4, 1 mM
EDTA, 250 mM sucrose, 10 mM NaF, 1 mM phenylmethylsulfonyl fluoride. The
homogenate was brought to 4°C and centrifuged for 20 minutes at 16,000 x g. The 16,000 x
g supernatant was centrifuged at 48,000 x g for 20 minutes to obtain a pellet of high density
microsomes and the resulting supernatant was centrifuged for 90 minutes at 200,000 x g to
obtain a pellet of low density microsomes. The low density microsomes were resuspended at
a final concentration of approximately 1-3 mg/ml. Protein was quantified using the
bicinchoninic acid protein determination kit (Pierce) with bovine serum albumin as standard.

GLUT4-enriched fractions were then isolated from LD microsomal fractions utilizing
sucrose velocity gradient centrifugation (Kandror et al., 1995, Biochem. J.
307:383-390; Heller-Harrison et al., 1996, J. Biol. Chem. 271:10200-10204). Briefly, 1.5 to
2 mg of LD microsomal fractions were loaded onto a 10-35% sucrose velocity gradient
(sucrose in buffer B: 20 mM HEPES, pH 7.4, 100 mM NaCl, 1 mM EDTA,
2 mM dithiothreitol, 1 mM, 10 mM NaF, 1 mM NaPPi, 0.1 mM Na3VO4,
1 mM phenylmethylsulfonyl fluoride, 10 µg/ml aprotinin and 10 µg/ml leupeptin) and
centrifuged for 3.5 hours at 110,000 x g in an SW28 rotor (Beckman), and 1 ml fractions
were collected. The crude membrane fraction contains most of the GLUT4 present in
unstimulated adipocytes and is composed primarily of intracellular membranes (Czech and
Buxton, supra). This additional centrifugation step separates about 90% of the total
membrane protein (fractions 1-7) from the GLUT4-enriched membranes (fractions 8-18).

Insulin treatment of rat adipocytes prior to disruption of the cells and preparation of 
these membranes causes a marked decrease in the yield of GLUT4 present in the latter 
fractions. However, no such insulin effect is observed when total membrane protein is 
measured because these membranes are still highly contaminated with membranes that do not 
contain GLUT4 and are not insulin-responsive. 

To further resolve the membrane species associated with GLUT4, fractions 8-18,
which contained most of the GLUT4 from the sucrose velocity gradient, were subjected to
equilibrium gradient centrifugation. Fractions from sucrose velocity gradients containing
GLUT4-membranes (fractions 8 to 18) were pooled, pelleted by ultracentrifugation at
48,000 rpm for 1.5 hours, resuspended in buffer B and then loaded onto an equilibrium
density sucrose gradient (10-65% (w/v) in buffer B) and centrifuged at 150,000 x g for 18
hours in a SW 50.1 rotor (Beckman). After centrifugation, 0.25 ml fractions were collected
starting from the top of the gradient. Fractions were analyzed for the total protein content
using a Bradford assay (Bio-Rad).
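
Centrifugation conditions in this example are given sometimes as relative centrifugal force (x g) and sometimes as rotor speed (rpm). The two are related by the standard expression RCF = 1.118 x 10^-5 x r x N^2, where r is the radial distance in centimeters and N is the speed in rpm. A minimal sketch of the conversion follows. The function names are hypothetical, and the radius must be taken from the rotor manufacturer's documentation; no radius for the rotors named above is assumed here.

```python
import math

# (2*pi/60)^2 / 980.665, i.e. the factor that converts r [cm] * rpm^2
# into multiples of standard gravity; conventionally rounded to 1.118e-5.
RCF_CONSTANT = 1.118e-5


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given x g at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))
```

Because the force depends on the radius, a speed stated in rpm specifies the run fully only when the rotor, and the position within it, are also known.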

Most of the membrane protein was distributed over fractions 5-20 after this 
procedure, whereas most of the GLUT4 was distributed within fractions 7-14. Importantly, 
GLUT4 was localized into two types of membranes (GLUT4 membranes) that can be 
distinguished based on their sensitivity to insulin. The amount of GLUT4 in fractions 7-9 
(peak 1) was decreased when the cells were treated with insulin before homogenization and
preparation of membranes, whereas the GLUT4 in fractions 10-20 (peak 2) was not affected 
by insulin treatment of the adipocytes. Strikingly, measurement of total membrane protein in 
the fractions of this gradient revealed a similar profile: about a 50% reduction in fractions 7-9 
due to insulin action, with no insulin effect observed in fractions 10-20. This observed 
insulin-mediated decrease in total membranes recovered in fractions 7-9 indicates the 
successful partial purification of membranes of the insulin-responsive compartment or 
compartments in primary adipocytes. Similar data were obtained using 3T3-L1 adipocytes.
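
The comparison that identifies insulin-responsive fractions can be expressed as a per-fraction calculation: the fractional decrease in GLUT4 (or total membrane protein) recovered from insulin-treated cells relative to untreated cells. The sketch below uses hypothetical function names, an arbitrary 25% cutoff, and made-up numbers chosen only to show the calculation; none of these values are measurements from the experiments described here.

```python
def fractional_decrease(basal, insulin):
    """Per-fraction decrease of the insulin-treated amount relative to basal."""
    return {
        fraction: (basal[fraction] - insulin[fraction]) / basal[fraction]
        for fraction in basal
        if fraction in insulin and basal[fraction] > 0
    }


def insulin_responsive_fractions(basal, insulin, cutoff=0.25):
    """Gradient fractions whose content falls by at least `cutoff`."""
    decrease = fractional_decrease(basal, insulin)
    return sorted(f for f, value in decrease.items() if value >= cutoff)


# Illustrative values only (arbitrary units), keyed by gradient fraction number.
example_basal = {7: 10.0, 8: 12.0, 9: 8.0, 10: 6.0, 11: 5.0}
example_insulin = {7: 5.0, 8: 6.0, 9: 4.0, 10: 6.0, 11: 5.1}
responsive = insulin_responsive_fractions(example_basal, example_insulin)
```

Applying the same calculation to both GLUT4 and total protein is what distinguishes a purified insulin-responsive compartment from a fraction in which only GLUT4 changes.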
These methods can be used to, e.g., provide an enriched preparation of glucose
transport-related sequences. In addition, in screening assays, a test compound can be
incubated with the cells before isolation of the vesicles and the ability of the test compound
to affect the localization of the glucose transport-related sequence determined.

Example 5: Characterization of GLUT4 Membranes 

Two additional approaches were used to characterize the membranes resolved by
equilibrium gradient centrifugation. First, each fraction from the gradient was analyzed by
SDS-PAGE and silver staining of the constituent proteins. This analysis revealed that most
of the membrane proteins in fractions 7 and 8 were dramatically reduced when membranes
were derived from insulin-treated adipocytes. Certain proteins in fractions 6 and 9 showed
the same effect, whereas many did not. These results suggest that membranes resolved in
fractions 7 and 8 are highly purified insulin-responsive membranes, while those in fractions 6
and 9 are only partially purified. Membranes in higher density fractions show no detectable
insulin-sensitivity in spite of the presence of significant GLUT4 protein. Many of the protein
bands in the insulin-sensitive membranes are also present in the membranes that are not
responsive to the hormone. These data are consistent with the hypothesis that the
insulin-sensitive membranes containing GLUT4 contain many of the same constituent proteins as
other cell membranes that function in a hormone-insensitive mode. Thus, these proteins may
also be targets for drugs that potentiate insulin action and ameliorate type II diabetes.

To further characterize the GLUT4 membrane preparation, we determined the 
distribution of transferrin receptors, thought to be present in endosomal membranes, and 
VAMP2 (vesicle-associated membrane protein), thought to be associated with insulin- 
sensitive GLUT4-containing membranes (Kandror and Pilch, 1996, J. Biol. Chem. 
271:21703-21708: Kandror and Pilch, J 996, Am. J, Physiol. 271 :EI-EI4). Surprisingly, both 
of these proteins were present in the fractions that were responsive to insulin and their 
distributions were more restricted to these fractions than was GLUT4 itself. These data 
suggest that the insulin-sensitive membranes in these fractions are contaminated by recycling
endosomes, that transferrin receptor is present in the insulin-sensitive membranes, or both. 
The restriction of VAMP2 to the insulin-sensitive fractions is consistent with data showing 
that VAMP2 function is necessary for GLUT4 translocation to the plasma membrane in 
response to insulin (Cain et al., 1992, J. Biol. Chem. 267:11681-11684; Martin et al., 1996, J.
Cell. Biol. 134:625-635). 

Expression of transferrin receptor and/or VAMP2 can therefore be used as part of a system for
analyzing glucose transport, e.g., in diagnosing type II diabetes.

These experiments provide an example of a method for analyzing glucose transport,
e.g., in an individual with type II diabetes. In such a case, insulin-sensitive cells from the
individual are cultured and analyzed as above. Alterations in the amount or distribution of
vesicle proteins compared to a control (i.e., normal with respect to diabetes) indicate that the
individual has or is at risk for a disorder involving glucose transport. Testing cells from the
individual that were cultured in the presence or absence of insulin provides additional
information regarding hormone sensitivity (e.g., by examining the distribution of vesicle
proteins in the presence and absence of hormone).

Example 6: Identification of cytoskeletal proteins in GLUT4-containing membranes

To identify proteins present in the insulin-sensitive membranes containing GLUT4,
the equivalent of fractions 7 and 8 were pooled and analyzed by SDS-PAGE, and the gels were
silver stained. These results confirmed that many of the resident proteins in the membranes derived
from insulin-treated cells were present at lower abundance compared to controls. Many of
the protein bands, combined from both lanes, were subjected to tryptic hydrolysis and the
peptides analyzed by mass spectrometry as described in Example 6. Of the proteins
identified by this procedure, peptides derived from GLUT4 itself appeared in two closely
spaced bands. Remarkably, the lower of these bands also contained a peptide corresponding
to the phosphorylated form of the COOH-terminus of GLUT4, indicating significant amounts
of phosphorylated GLUT4 are present in insulin-sensitive membranes. In addition, peptides
corresponding to several proteins previously reported to be present in these membranes were
identified, including the IGF-II/mannose-6-phosphate receptor, IRAP (insulin-regulated
aminopeptidase), amine oxidase, long chain acyl-CoA synthetase, and SCAMPs (secretory
carrier-associated membrane proteins). Two proteins not previously known to be present in
insulin-sensitive GLUT4-containing membranes were also identified: vimentin, an
intermediate filament subunit, and α-tubulin, the microtubule protein.

Two approaches were taken to determine if vimentin and α-tubulin are directly
associated with membrane vesicles that also contain GLUT4 and are insulin-sensitive. In one
approach, the membrane preparations obtained from the equilibrium gradient centrifugation
were analyzed by MALDI-TOF MS analysis. In a second approach, the fractions were
analyzed by immunoelectron microscopy using anti-GLUT4, anti-vimentin and
anti-tubulin antibodies.

MALDI-TOF MS Analysis 

Proteins resolved by SDS-PAGE were visualized by silver staining (Bio-Rad) and the
bands were excised from one single dimensional 5-15% gel. The silver-stained protein
bands were destained and digested with trypsin in gel according to Gharahdaghi et al.
(1999, Electrophoresis 20:601-605) with some slight modifications. The digested samples
were further concentrated and desalted with Millipore ZipTip C18 micro tips prior to
MALDI-TOF (matrix-assisted laser desorption ionization time-of-flight) analysis.
MALDI-TOF analyses were performed on a Kratos Analytical Kompact SEQ instrument, equipped
with a curved field reflectron. Peptide masses were searched against the non-redundant
protein database using MS-Fit of the Protein Prospector program developed by Clauser et al.
(1999, Anal. Chem. 71:2871-2882) at University of California, San Francisco.
Fragmentation information obtained from individual peptides via Post-Source-Decay (PSD)
analysis was searched against the non-redundant protein database using the Protein
Prospector program MS-Tag.
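
The peptide mass search described above can be illustrated with a minimal sketch. This is not the MS-Fit algorithm; it only shows the underlying idea of comparing observed peptide masses with masses predicted from an in silico tryptic digest. The function names, the 0.5 Da tolerance, and the assumption of unmodified, singly protonated peptides with no missed cleavages are illustrative choices that do not come from the specification.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # charge carrier for the [M+H]+ ion


def tryptic_peptides(protein):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def mh_plus(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_masses(observed, protein, tol_da=0.5):
    """Return (observed mass, peptide) pairs that agree within tol_da (hypothetical tolerance)."""
    theoretical = [(mh_plus(p), p) for p in tryptic_peptides(protein)]
    return [(m, p) for m in observed for t, p in theoretical if abs(m - t) <= tol_da]
```

In practice a database search scores every protein in the database in this way and ranks the candidates by the number and quality of matched masses; missed cleavages and modifications such as phosphorylation would also have to be considered.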

Immunoelectron Microscopy 

Standard techniques were used to stain the prepared vesicles with anti-GLUT4,
anti-vimentin, and anti-tubulin antibodies conjugated to colloidal gold particles. Most of the
vesicles in the preparations show reactivity with anti-GLUT4, indicating relatively low
contamination with membranes that do not contain the transporter. Anti-vimentin and
anti-tubulin antibodies were used to detect vimentin and tubulin in GLUT4-positive membranes.
A fraction of these GLUT4-positive membrane vesicles also react directly with anti-vimentin
and anti-tubulin. Non-immune antibodies showed no detectable staining of these membranes
under the conditions of these experiments, while anti-GLUT4 staining was readily detected.
These results indicate that some GLUT4-containing membrane vesicles are associated with
the cytoskeletal proteins vimentin, α-tubulin, or both.

To further assess association of vimentin and α-tubulin with insulin-sensitive
membranes, the abundance of these cytoskeletal proteins was estimated using Western
analysis in each of the membrane fractions obtained by equilibrium gradient centrifugation.
The relative abundance of GLUT4 protein versus vimentin and α-tubulin throughout these
fractions was analyzed. Both vimentin and α-tubulin are present in all of the membrane
fractions of the gradient except for the top few fractions. Strikingly, both of these proteins
are greatly reduced in abundance in the same gradient fractions in which GLUT4 is reduced
in response to the action of insulin. In membrane fractions of higher density, the
concentrations of GLUT4, vimentin, and α-tubulin are all unaffected by prior treatment of
cells with insulin. Taken together, these experiments demonstrate that two cytoskeletal
proteins, vimentin and α-tubulin, are bound to subpopulations of the GLUT4-containing
membranes that are insulin-responsive in rat adipocytes.


Example 7: Identification of Proteins Expressed in GLUT4-Containing Vesicles 

GLUT4-containing membranes were isolated by velocity sedimentation, then further
fractionated using sucrose density equilibrium gradients, as described above. GLUT4-containing
fractions that exhibited the most insulin sensitivity (peak 1; fractions 7-8) and the
fractions containing GLUT4 that were less insulin sensitive (peak 2, when compared to the peak 1
fractions) were identified. The biogenesis of the peak 1 vesicle fraction was also observed to
increase during 3T3-L1 adipocyte differentiation. To identify proteins present in
GLUT4-containing vesicles, fractions corresponding to peak 1 from primary adipocytes, peak 1 from
3T3-L1 adipocytes, and peak 2 from 3T3-L1 adipocytes were pooled, subjected to
SDS-PAGE and silver stained. The protein bands were subjected to tryptic hydrolysis and the
peptides analyzed by mass spectrometry using standard techniques. Figs. 8A-8I are a list of
the peptides identified in peaks 1 and 2, as well as their GenBank Accession numbers and the
GenBank Accession numbers of a human homolog if one is available.

These proteins are useful as targets for compounds that modulate glucose transport as
well as for diagnosis of individuals having or at risk for disorders related to glucose transport.

Example 8: Comparison of Muscle-Adipocyte Union Library Sequences and GLUT4-Enriched
Vesicle Sequences

A comparison was made between the glucose transport-related proteins identified in
the subtractive and the Adipocyte Union libraries and glucose transport-related proteins
identified in glucose transport vesicles. Fig. 9 lists those proteins that were found in
at least one of the libraries and were also identified in peak 1 or 2 of the vesicle
preparation. Acetyl-CoA carboxylase, carboxylesterase, caveolin-1, and CDC36 are listed in
this figure although their presence in peak 1 or peak 2 is not confirmed.

Example 9: Analysis of Gene Expression Using DNA Arrays 

DNA arrays can be used to assay the levels of gene expression of selected gene 
sequences. These were measured by assaying the amount of mRNA for the gene sequences 
selected for analysis in undifferentiated 3T3-L1 fibroblasts and differentiated 3T3-L1
adipocytes. The sequences selected for analysis are selected from the MAU library. Clones 
from the library that show significantly different levels of expression in differentiated 
adipocytes are selected for further analysis of their role in glucose transport. 

A protocol for analyzing an array follows. 

1. Clones that are previously sequenced are selected from the MAU library. These
clones consist of known and unknown genes with various levels of expression in fibroblasts
and adipocytes.

2. Each of the clones is diluted 1:50 and then amplified by PCR.

3. PCR fragments are gel purified and re-suspended in 20-30 µl of ddH2O.

4. Nucleic acid concentration of the PCR products is measured by spectrophotometer
(OD260) and further dilutions are made bringing all samples to a concentration of 100 ng/µl.

5. The PCR samples are then dot blotted (i.e., each to a separate address) onto a
charged nylon membrane at 50 ng per dot as described in steps a-c.

a. The PCR samples are diluted to the desired concentration in 0.2 M
NaOH/10 mM EDTA (denaturation solution) and then incubated at 37°C for fifteen
minutes.

b. The nylon membranes are pre-wetted and placed into a dot blot apparatus.
Suction is applied to the apparatus and buffer is washed through the openings.



WO 02/33046 PCT/US01/49451 

c. After denaturation the DNA solution is placed in the apparatus (each sample
in a separate well) and suction is applied. Once the solution has gone through the
filter, the wells are washed with additional denaturation solution. The membrane is
then removed from the apparatus and cross-linked with UV radiation. Membranes
are then baked to dryness and stored in sealed bags until ready for use. The
membrane with the PCR samples is referred to as an array.

6. To analyze expression, the arrays are pre-hybridized for at least five minutes in
modified Church's buffer (7% SDS, 1 mM EDTA, 0.5 M NaHPO4 pH 7.2).

7. Probes for the arrays are labeled in a modified first strand cDNA synthesis
reaction as follows:

a. Two labeling reactions are carried out side by side, one using adipocyte mRNA
as the substrate and the other using fibroblast mRNA as the substrate.

b. For each labeling reaction, 2 µg of mRNA is combined with 2 µl of oligo d(T)
and 2 µl of random hexamer and incubated at 70°C for 10 minutes and then
chilled on ice.

c. After the incubation, add 4 µl of 5X first strand buffer, 2 µl of 0.1 M dithiothreitol
(DTT), 1 µl of a modified dNTP solution (A, T, and G at 500 µM final; C at
5 µM final), and 5 µl of labeled dCTP. Mix, microfuge, and place at 37°C for
2 minutes.

d. Add reverse transcriptase (2 µl Superscript II; Life Technologies Inc., Rockville,
MD), mix, and place at 32°C for one hour.

e. Place on ice to stop reaction.

8. Unincorporated dNTPs are removed from the probe mixture by passing the mixture
through a G50-150 Sephadex column (Sigma) and centrifuging for 1 minute at 1000 x g.

a. To the labeling reaction add 1 µl 1% SDS, 1 µl 0.5 M EDTA, and 3 µl 3 M NaOH
and incubate at 68°C for three minutes and then at room temperature for fifteen
minutes.

b. Add 10 µl of 1 M Tris-HCl pH 7.5 and 3 µl of 2 N HCl.

c. Add an additional 50 µl of TEN (10 mM Tris-Cl, 1 mM EDTA, 100 mM NaCl,
pH 8.0) buffer to the tube and filter the labeled mix through a G50-150
Sephadex column to remove unincorporated nucleotides.

d. Add 50 µg of Cot1 DNA (Life Technologies Inc., Rockville, MD) to this mixture,
boil for five minutes, and hold at 68°C until ready to use.

9. The probe is added to a sufficient volume of modified Church's buffer and the mixture is
added to the filters (add approximately the same number of counts to each array) and
hybridized overnight at 65°C with gentle rocking.

10. After hybridization the filters are washed as follows: twice at room temperature with
2X SSC/0.05% SDS for five minutes, once at room temperature with 0.1X SSC/0.1% SDS
for ten minutes, and finally once or twice at 65°C with 0.1X SSC/0.1% SDS for 1 hour.

11. The damp arrays are wrapped in plastic wrap and put on a phosphor-imaging screen
overnight (filters may also be placed on auto-rad film).

12. Commercially available programs for phosphor-imagers quantify images. Alternatively
the images can be quantified with commercially available graphics or image analysis
programs. The quantified values represent the relative amount of expression of each
sequence on the array.

13. The values are further analyzed by subtracting background from each measurement and
the values are then graphically represented to facilitate comparisons between the values
for fibroblasts and for adipocytes.
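
The arithmetic behind steps 4 and 13 of the protocol can be sketched as follows. The sketch assumes the common conversion of about 50 ng/µl of double-stranded DNA per OD260 unit at a 1 cm path length, a single background value per array, and clamping of negative values to zero; none of these details are stated in the protocol, and all function and clone names are hypothetical.

```python
def dsdna_ng_per_ul(od260, dilution_factor=1.0):
    """Estimate dsDNA concentration, assuming ~50 ng/ul per OD260 unit (1 cm path)."""
    return od260 * 50.0 * dilution_factor


def dilute_to(stock_ng_per_ul, target_ng_per_ul, final_ul):
    """Return (stock volume, diluent volume) in ul, from C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


def background_subtract(signals, background):
    """Subtract one background value from each dot; negative results are set to zero."""
    return {clone: max(value - background, 0.0) for clone, value in signals.items()}


def expression_ratios(adipocyte, fibroblast):
    """Adipocyte/fibroblast ratio per clone; None where the fibroblast signal is zero."""
    return {
        clone: (adipocyte[clone] / fibroblast[clone] if fibroblast[clone] > 0 else None)
        for clone in adipocyte
    }
```

For example, a PCR product read at OD260 = 0.5 after a 10-fold dilution is estimated at 250 ng/µl, so 20 µl of stock plus 30 µl of diluent gives 50 µl at 100 ng/µl.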

This method allows for screening of multiple sequences in a single procedure. Such 
methods are useful for analyzing expression profiles in individuals having or at risk for a 
disorder related to glucose transport, for analyzing the ability of a test agent or a candidate
agent to alter expression of a gene involved in glucose transport, and to analyze compounds
that may be useful as drugs for other disorders for potential (deleterious) side effects 
resulting from unintended alterations in expression of genes involved in glucose transport. 

Similar methods of analysis using arrays can be used for diagnostic purposes. For 
example, expression of sequences encoding proteins involved in glucose transport can be 
analyzed using a nucleic acid sample from the cells of an individual suspected of having a 
glucose transport-related disorder (e.g., type II diabetes). In general, the nucleic acid sample 
will represent sequences expressed in a cell type that conducts glucose transport. The 
sequences analyzed include sequences more highly expressed in adipocytes and/or muscle 
cells than in fibroblasts (including sequences expressed in adipocytes and/or muscle cells and 
having no detectable expression in fibroblasts). Such sequences are described herein. The
level of expression of the sequences represented in the array is compared to a reference level
of expression (representing the amount of expression present in an unaffected individual who
is not at risk for the disorder). An alteration in the level of expression of one or more of the
sequences indicates that the individual has or is at risk for the glucose transport-related
disorder. The array may include one or more sequences that are used as standards (i.e.,
reference sequences) to normalize the data between reactions. In general, the sequences used
as standards correspond to genes whose expression is not affected in glucose transport 
disorders. Sequences used as standards can also correspond to genes that are not 
differentially expressed between adipocytes, muscle cells, and fibroblasts. Examples of such 
sequences are described herein. 
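
The comparison to a reference level, with normalization to standard sequences, can be sketched as follows. The scaling by the mean signal of the standards and the two-fold cutoff are assumptions chosen for illustration; the specification does not fix a normalization formula or a threshold, and all names are hypothetical.

```python
from statistics import mean


def normalize(signals, standards):
    """Divide each signal by the mean signal of the standard (reference) sequences."""
    standards = set(standards)
    scale = mean(signals[name] for name in standards)
    if scale <= 0:
        raise ValueError("standard sequences gave no usable signal")
    return {name: value / scale for name, value in signals.items() if name not in standards}


def altered_sequences(test, reference, standards, fold=2.0):
    """Return sequences whose normalized test/reference ratio is at least `fold` up or down.

    `fold` is a hypothetical cutoff, not a value taken from the specification.
    """
    test_norm = normalize(test, standards)
    ref_norm = normalize(reference, standards)
    hits = {}
    for name, value in test_norm.items():
        if ref_norm[name] <= 0:
            continue  # a ratio is undefined without a reference signal
        ratio = value / ref_norm[name]
        if ratio >= fold or ratio <= 1.0 / fold:
            hits[name] = ratio
    return hits
```

Normalizing both arrays to the same standards removes differences in probe labeling and hybridization efficiency between reactions before the test and reference levels are compared.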

Example 10: Genechip Identification of Genes Not Expressed in 3T3-L1 Fibroblast, but 
Present in 3T3-L1 Adipocytes and Muscle 

To further identify genes that are preferentially expressed in cells conducting glucose 
transport, the mouse U74A Genechip (Affymetrix) was probed with two independently 
produced sets of probes from 3T3-L1 fibroblasts, 9-day-old 3T3-L1 adipocytes, and mouse
muscle. The experiments were carried out using standard methods, essentially as described 
above. The genes listed in Figs. 13A-13C are those whose expression was not detected in 
fibroblasts, and was detected in adipocyte or muscle on one or both of the duplicate
Genechips based on the Absolute call of gene expression made by the Affymetrix Microarray 
Suite Software. The columns in Figs. 13A-13C marked f1 and f2 are data from the fibroblast
replicate chips. The columns marked a1 and a2 are data from the adipocyte replicate chips,
and the columns marked ml and m2 are data from the muscle replicate chips. A indicates 
that the gene is absent in a tissue. P indicates that the gene is present in a tissue. An M 
indicates marginal signal and the software cannot determine if the gene is absent or present. 
The function classes of proteins listed in the last column are: Class 1 are genes encoding
metabolic proteins; Class 2 are genes encoding signaling proteins; Class 3 are genes
encoding cytoskeletal or trafficking proteins; Class 4 are other proteins whose function is
something other than those of Classes 1-3; and Class 5 are proteins of unknown function.
Genes in italics encode mitochondrial proteins.

Genes that are expressed in adipocyte and/or muscle and are not expressed in
fibroblasts are useful, e.g., for identifying genes whose expression is altered in disorders
involving glucose transport, for detecting aberrations in glucose transport, and as targets for
drugs designed to alter glucose transport. Genes that are expressed in both fibroblasts and
adipocytes and/or muscle cells are also useful as reference sequences, e.g., to normalize data
obtained when measuring expression patterns of genes expressed in glucose transport in a
sample.
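
The selection rule used for Figs. 13A-13C can be expressed as a short filter over the Absolute calls. The sketch assumes the calls have been exported as one record per gene with the column labels f1, f2, a1, a2, m1, and m2, and it treats a marginal (M) call in a fibroblast chip as not absent; both are assumptions, and the gene names in the usage example are placeholders.

```python
FIBROBLAST_CHIPS = ("f1", "f2")
TISSUE_CHIPS = ("a1", "a2", "m1", "m2")


def absent_in_fibroblast_present_elsewhere(calls):
    """Select genes called A on both fibroblast chips and P on at least one other chip.

    `calls` maps a gene name to a dict of chip label -> "A", "P", or "M".
    """
    selected = []
    for gene, by_chip in calls.items():
        absent_in_fibroblasts = all(by_chip[chip] == "A" for chip in FIBROBLAST_CHIPS)
        present_elsewhere = any(by_chip[chip] == "P" for chip in TISSUE_CHIPS)
        if absent_in_fibroblasts and present_elsewhere:
            selected.append(gene)
    return selected
```

A call table with three placeholder genes shows the behavior: a gene absent in both fibroblast chips and present in one adipocyte chip is kept, while a gene present or marginal in either fibroblast chip is dropped.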

Example 11: Probe sets on Affymetrix GeneChip U74A whose expression is increased 
in both 3T3-L1 adipocytes and muscle compared to fibroblasts. 

To determine the relative expression levels of genes in cells that conduct glucose 
transport compared to cells that do not conduct glucose transport, the mouse U74A GeneChip 
was probed with three independently produced cDNA probes from 3T3-L1 fibroblasts, 9-day-old
3T3-L1 adipocytes, and mouse muscle. The experiments were conducted using standard
methods, essentially as described above. The genes listed in Figs. 14A-14G are those whose
expression was determined to be the same on all fibroblast chips, and increased on both
adipocyte and muscle GeneChips based on the difference change of gene expression made by
the Affymetrix Microarray Suite Software when compared to the first fibroblast chip. The
columns marked f1, f2, and f3 are fibroblast replicate chips. The columns marked a1, a2, and
a3 are adipocyte replicate chips, and the columns marked m1, m2, and m3 are the muscle
replicate chips. NC indicates no change of expression. MI indicates that there was a
moderate increase in expression. An I indicates an increase in expression. The function
classes of the genes listed in the last column are as follows: Class 1 genes encode metabolic
proteins; Class 2 genes encode signaling proteins; Class 3 genes encode cytoskeletal or
trafficking proteins; Class 4 genes encode proteins with functions other than those of Classes
1-3; and Class 5 are proteins of unknown function. Genes listed in italics encode
mitochondrial proteins.

Genes with increased expression in adipocyte and/or muscle compared to fibroblasts 
are candidate genes for a glucose transport pathway. Such genes are useful, e.g., for
identifying genes whose expression is altered in disorders involving glucose transport,
detecting aberrations in glucose transport (e.g., for diagnostic purposes), and as targets for
drugs designed to alter glucose transport. Genes whose expression is the same in fibroblasts
and adipocytes and/or muscle cells are also useful as reference sequences, e.g., to normalize 
data obtained when measuring expression patterns of genes expressed in glucose transport in 
a sample. 

In selecting nucleic acid sequences for the uses described herein, any of the genes or
sequences identified using any of the above methods (i.e., subtraction libraries, vesicle
proteins, or microarrays) can be combined. Particularly useful are those sequences
corresponding to genes found to be preferentially expressed in adipocytes or muscle cells
compared to fibroblasts in at least two of the methods. In some embodiments, the sequences
are selected from those that are preferentially expressed in both adipocytes and muscle cells
compared to their expression in fibroblasts in at least two of the methods.
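
Selecting sequences supported by at least two of the methods amounts to counting, for each sequence, the number of methods in which it was identified. A minimal sketch, with hypothetical method labels and sequence identifiers:

```python
from collections import Counter


def found_in_at_least(hits_by_method, minimum=2):
    """Return sequences identified by at least `minimum` of the methods, sorted by name.

    `hits_by_method` maps a method label to the sequences that method identified.
    """
    counts = Counter(
        sequence
        for hits in hits_by_method.values()
        for sequence in set(hits)  # count each sequence once per method
    )
    return sorted(sequence for sequence, n in counts.items() if n >= minimum)
```

Given hits from a subtraction library, the vesicle protein analysis, and a microarray, the function returns only the sequences that appear in two or more of the three lists.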

Example 12: Assay for GLUT4 transport/insulin mediated transport 

Methods are available for the rapid testing of the functions of proteins identified as
glucose transport-related proteins, e.g., by assaying their role in GLUT4 regulation. For
example, a reporter molecule that is a chimera of the transferrin receptor (exofacial domain)
and the IRAP (insulin-regulated aminopeptidase) protein that traffics in cells like GLUT4 has
been described as a surrogate for GLUT4 (Johnson et al., 2001, Mol. Biol. Cell 12:367-381;
Lampson et al., 2000, J. Cell Sci. 113:4065-4076; Subtil et al., 2000, J. Biol. Chem.
275:4787-95; Johnson et al., 1998, J. Biol. Chem. 273:17968-17977). This chimera is
expressed in cells and is sequestered in the perinuclear region under basal conditions. Insulin
then stimulates the chimera's translocation to the cell surface. The translocation can be
readily measured using an antibody raised against the exofacial domain of the transferrin
receptor or by labeled transferrin itself. This assay is then applied to cells in which the
protein of interest (e.g., a glucose transport-related protein) has altered expression. For
example, the protein of interest can be overexpressed in a cell that also expresses the
transferrin receptor/IRAP chimera, and the effect of overexpression on insulin regulation of
translocation assayed. This assay can also be used to determine if a test agent or candidate
agent targeted to a glucose transport-related protein is an effective modulator of insulin
regulation of translocation. For example, the candidate agent can be a ribozyme or antisense
sequence that is targeted to a nucleic acid sequence encoding a glucose transport-related
protein, e.g., RabGAP or endophilin lb. Similarly, the assay can be performed in the
presence and absence of a candidate agent targeted to a glucose transport-related protein or
nucleic acid sequence. An alteration in transport of the chimera in the presence of the
candidate agent indicates that it is a candidate agent useful for treating a disorder associated
with aberrant glucose transport (e.g., type II diabetes).
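
One way to summarize the readout of such an assay is the fold stimulation of the cell-surface chimera signal by insulin, compared between cells with and without the candidate agent. The sketch below is an illustration only; the use of a simple insulin/basal ratio and the function names are assumptions rather than part of the described method.

```python
def fold_stimulation(basal, insulin):
    """Insulin-stimulated surface signal relative to the basal surface signal."""
    if basal <= 0:
        raise ValueError("basal surface signal must be positive")
    return insulin / basal


def agent_effect(control, treated):
    """Ratio of fold stimulation with the agent to fold stimulation without it.

    `control` and `treated` are (basal, insulin) surface-signal pairs.
    A value below 1 suggests reduced insulin-stimulated translocation,
    a value above 1 suggests enhanced translocation.
    """
    return fold_stimulation(*treated) / fold_stimulation(*control)
```

For instance, if insulin raises the surface signal 5-fold in control cells but only 2.5-fold in treated cells, the agent effect is 0.5.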

Two examples of genes identified using the methods described herein that can be
used in the assay methods described above are those encoding an apparent RabGAP and
endophilin lb. The RabGAP protein is predicted to be a negative regulator of Rab GTPases,
which are known to promote membrane recycling of GLUT4 as it transits from intracellular
storage sites to the plasma membrane and back into the cell. One such protein, Rab4, is
implicated in directing GLUT4 to its perinuclear recycling compartment, a necessary step for
GLUT4 to respond to insulin. The RabGAP that was identified is predicted to inhibit Rab4
by increasing the GTPase activity of Rab4, leading to its binding GDP and deactivation.
Thus, RabGAP is an excellent drug target in that its inhibition might lead to promoting Rab4,
a required element in the regulation of GLUT4 by insulin. Endophilin lb is related to a class
of brain endophilin proteins that are involved in promoting endocytosis of plasma membrane
proteins. The high expression of endophilin lb in adipocytes indicates that it is likely to be
involved in endocytosis of GLUT4 in these cells. Endophilin lb is therefore another
potential drug target in that its inhibition by a drug is predicted to retain GLUT4 at the cell
surface membrane where it can promote glucose transport, thereby lowering blood glucose.

OTHER EMBODIMENTS 

It is to be understood that while the invention has been described in conjunction 
with the detailed description thereof, the foregoing description is intended to illustrate and 
not limit the scope of the invention, which is defined by the scope of the appended claims. 
Other aspects, advantages, and modifications are within the scope of the following claims.

1. A method of identifying a gene whose expression is altered in a glucose transport-related
disease or disorder, the method comprising:

providing a nucleic acid array comprising 4 or more nucleic acids immobilized on a 
solid support, each nucleic acid comprising a sequence of 10 or more consecutive nucleotides 
within any one of the sequences listed in Figs. 1, 2A-2R, 3A-3E, 6A-6E, 7A-7U, 8A-8I, 9,
13A-13C, and 14A-14G or a complement thereof;

providing a reference nucleic acid sample prepared from a tissue of a normal, control 
mammal; 

contacting the array with the reference sample; 

detecting hybridization of the reference sample with nucleic acids in the array, to 
obtain a reference pattern of glucose transport-related gene expression; 

providing a test nucleic acid prepared from a tissue of a mammal having a glucose 
transport-related disease or disorder; 

contacting the array with the test sample; 

detecting hybridization of the test nucleic acid with nucleic acids in the array, to 
obtain a test pattern of glucose transport-related gene expression; and 

comparing the reference pattern with the test pattern to detect a gene whose 
expression is altered in the test pattern relative to its expression in the reference pattern. 

2. The method of claim 1, wherein the array comprises 10 or more nucleic acids.

3. The method of claim 1, wherein the array comprises 100 or more nucleic acids.

4. The method of claim 1, wherein the array comprises not more than 100 nucleic
acids.

5. The method of claim 1, wherein the array comprises not more than 200 nucleic
acids.

6. The method of claim 1, wherein the array comprises not more than 300 nucleic
acids.

7. The method of claim 1, wherein the sequence comprises 30 or more nucleotides.

8. The method of claim 1, wherein the reference nucleic acid and the test nucleic
acid are cDNAs.

9. The method of claim 8, wherein the cDNAs comprise a fluorescent label. 

10. A nucleic acid array comprising 4 or more nucleic acids immobilized on a solid 
support, each nucleic acid comprising a sequence of 10 or more consecutive 
nucleotides within any one of sequences listed in Figs. 1, 2A-2R, 3A-3E, 6A-6E,
7A-7U, 8A-8I, 9, 13A-13C, and 14A-14G.

11. The array of claim 10, wherein the array comprises 100 or more nucleic acids.

12. The array of claim 10, wherein the array comprises not more than 100 nucleic 
acids. 

13. The array of claim 10, wherein the array comprises not more than 200 nucleic 
acids. 

14. The array of claim 10, wherein the array comprises not more than 300 nucleic 
acids. 

15. An isolated nucleic acid molecule comprising a nucleotide sequence selected 
from the group consisting of SEQ ID NOS: 1 -3, or a complement thereof. 

16. A nucleic acid molecule of claim 15, consisting of a nucleotide sequence selected
from the group consisting of SEQ ID NOS:1-3, or a complement thereof and a
non-nucleic acid modifying group bound to either a 3' or 5' end of the nucleotide
sequence or both.

17. A nucleic acid molecule of claim 15, consisting of a nucleotide sequence selected
from the group consisting of SEQ ID NOS:1-3, or a complement thereof, and a
synthetic nucleic acid sequence bound to a 3' or 5' end of the nucleic acid
sequence or both.

18. An isolated polypeptide comprising an amino acid sequence encoded by a nucleic
acid sequence selected from the group consisting of SEQ ID NOS:1-3.

19. An isolated nucleic acid molecule comprising a nucleic acid sequence selected 
from the group consisting of SEQ ID NOS:4-93, or a complement thereof. 

20. A nucleic acid molecule of claim 19, consisting of a nucleotide sequence selected 
from the group consisting of SEQ ID NOS:4-93, or a complement thereof and a 
non-nucleic acid modifying group bound to either a 3' or 5' end of the nucleotide 
sequence or both. 

21. A nucleic acid molecule of claim 19, consisting of a nucleotide sequence selected 
from the group consisting of SEQ ID NOS:4-93, or a complement thereof, and a 
synthetic nucleic acid sequence bound to a 3' or 5' end of the nucleic acid
sequence or both. 

22. An isolated nucleic acid molecule of claim 19, consisting of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS:4-93, or a
complement thereof.

23. An isolated polypeptide comprising an amino acid sequence encoded by a nucleic 
acid sequence selected from the group consisting of SEQ ID NOS:4-93. 

24. A method for identifying a candidate agent that modulates the expression or
activity of a glucose transport-related polypeptide, the method comprising: 

a) providing a sample containing a glucose transport-related polypeptide; 

b) adding a test agent to the sample; 

c) assaying the sample for expression or activity of the glucose transport-related 
polypeptide; and 

d) comparing the effect of the test agent on expression or activity of the glucose
transport-related polypeptide relative to a control, wherein a change in glucose 
transport-related polypeptide expression or activity indicates that the test 
agent is a candidate agent that can modulate expression or activity of the 
glucose transport-related polypeptide.

25. The method of claim 24, wherein the test agent is selected from the group
consisting of a polynucleotide, a polypeptide, a small non-nucleic acid organic 
molecule, a small inorganic molecule, and an antibody. 

26. The method of claim 24, wherein the test agent is selected from the group 
consisting of an antisense oligonucleotide and a ribozyme. 

27. The method of claim 24, wherein the glucose transport-related polypeptide is 
assayed using an antibody. 

28. The method of claim 24, wherein the glucose transport-related polypeptide is a 
human glucose transport-related polypeptide. 

29. The method of claim 24, wherein the method comprises the step of determining 
whether glucose transport is modulated in the presence of the test agent. 

30. The method of claim 29, wherein glucose transport is decreased in the presence 
of the test agent. 

31. The method of claim 29, wherein glucose transport is increased in the presence of 
the test agent. 

32. The method of claim 24, wherein the assay is a cell based assay. 

33. The method of claim 24, wherein the assay is a cell-free assay. 

34. The method of claim 24, wherein the glucose transport-related polypeptide is 
selected from the group of polypeptides encoded by sequences comprising the 
nucleic acid sequences listed in Figs. 1, 2A-2R, and 3A-3E, and the polypeptides
listed in Figs. 6A-6E, 7A-7U, 8A-8I, 9, 13A-13C, and 14A-14G.

35. A method for identifying a candidate agent that modulates expression of a
glucose transport-related polynucleotide, the method comprising:

a) providing a sample in which a glucose transport-related polynucleotide is
expressed;

b) adding a test agent to the sample;

c) detecting expression of the glucose transport-related polynucleotide;

d) determining the amount of expression of the glucose transport-related
polynucleotide; and

e) comparing the effect of the test agent on the amount of expression of the
glucose transport-related polynucleotide in the sample relative to a control, wherein a change
in the amount of expression from the glucose transport-related polynucleotide indicates the
test agent is a candidate agent that can modulate expression of the glucose transport-related
polynucleotide.

36. The method of claim 35, wherein the test agent is selected from the group
consisting of a polynucleotide, a polypeptide, a small non-nucleic acid organic
molecule, a small inorganic molecule, and an antibody.

37. The method of claim 35, wherein the test agent is selected from the group
consisting of an antisense oligonucleotide and a ribozyme.

38. The method of claim 35, wherein the glucose transport-related polynucleotide is
a human glucose transport-related polynucleotide.

39. The method of claim 35, wherein the method comprises the step of determining
whether glucose transport is modulated in the presence of the test agent.

40. The method of claim 39, wherein glucose transport is decreased in the presence
of the test agent.

41. The method of claim 39, wherein glucose transport is increased in the presence of
the test agent.



42. The method of claim 35, wherein the glucose transport-related polynucleotide is
selected from the group of sequences listed in Figs. 1, 2A-2R, and 3A-3E, or a
complement thereof, and listed in Figs. 6A-6E, 7A-7U, 8A-8I, 9, 13A-13C, and
14A-14G, or a complement thereof.

43. The method of claim 35, wherein the assay is a cell-based assay. 

44. The method of claim 35, wherein the assay is a cell-free assay. 
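
As an illustration of the comparison recited in step e) of claim 35, the following Python sketch compares expression measured in the presence of a test agent with expression in a control. The function names, the replicate values, and the 2-fold threshold are assumptions made for this example only; the claim itself requires only a change in the amount of expression relative to a control.

```python
from statistics import mean


def fold_change(treated, control):
    """Ratio of mean expression with the test agent to mean control expression."""
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control expression must be non-zero")
    return mean(treated) / control_mean


def is_candidate_agent(treated, control, threshold=2.0):
    """Flag the agent when expression rises or falls by at least `threshold`-fold.

    The default 2-fold cutoff is an illustrative assumption, not a value
    taken from the claims.
    """
    ratio = fold_change(treated, control)
    return ratio >= threshold or ratio <= 1.0 / threshold


# Hypothetical replicate measurements in arbitrary units.
control_wells = [100.0, 110.0, 90.0]
treated_wells = [260.0, 240.0, 250.0]

print(fold_change(treated_wells, control_wells))
print(is_candidate_agent(treated_wells, control_wells))
```

In practice the cutoff and any statistical test would depend on the assay format (cell-based or cell-free) and on the number of replicates.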

45. A method of diagnosing an individual having or at risk for a glucose transport- 
related disorder, the method comprising: 

(a) providing a nucleic acid array comprising 4 or more nucleic acids
immobilized on a solid support, each nucleic acid comprising a sequence
of 10 or more nucleotides, the sequence comprising or containing a
sequence selected from the group of the sequences listed in Figs. 1,
2A-2R, and 3A-3E, or a complement thereof, and the sequences of the genes
listed in Figs. 6A-6E, 7A-7U, 8A-8I, 9, 13A-13C, and 14A-14G, or a
complement thereof;

(b) providing a nucleic acid sample from the individual; 

(c) contacting the array with the sample from the individual;

(d) detecting hybridization of nucleic acid in the sample from the individual
with each nucleic acid in the array, to obtain a pattern of glucose transport-
related gene expression; and

(e) comparing the pattern of glucose transport-related gene expression in the
sample from the individual with a reference pattern, wherein a comparison
of the pattern of expression in the individual compared to the reference 
pattern indicates whether the individual has or is at risk for a glucose 
transport-related disorder. 

46. The method of claim 41, wherein the array comprises 10 or more nucleic acids.

47. The method of claim 41, wherein the array comprises 100 or more nucleic acids. 

48. The method of claim 41, wherein the array comprises not more than 100 nucleic 
acids. 

49. The method of claim 41, wherein the array comprises not more than 200 nucleic 
acids. 

50. The method of claim 41, wherein the array comprises not more than 300 nucleic 
acids. 

51. The method of claim 41, wherein the sequence comprises 30 or more nucleotides.

52. The method of claim 41, wherein the sample from the individual is a cDNA 
sample. 

53. The method of claim 48, wherein the cDNA sample comprises a fluorescent 
label. 

54. The method of claim 48, wherein the disorder is type II diabetes. 
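
The comparison of an individual's expression pattern with a reference pattern, as recited in the array-based diagnostic method above, could be sketched as follows. This is only one possible approach: the use of Pearson correlation, the reference labels, and all numeric values are assumptions introduced for illustration and are not specified in the claims.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length expression patterns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("patterns must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a pattern with no variation cannot be correlated")
    return cov / (sd_x * sd_y)


def closest_reference(sample, references):
    """Return the label of the reference pattern best correlated with the sample."""
    return max(references, key=lambda label: pearson(sample, references[label]))


# Hypothetical log-ratio signals for four array positions.
references = {
    "unaffected": [0.1, -0.2, 0.0, 0.3],
    "disorder": [-1.5, 1.2, -0.8, 2.0],
}
sample = [-1.2, 1.0, -0.5, 1.7]

print(closest_reference(sample, references))
```

A real comparison would use many more array positions and a validated decision rule; the sketch shows only the pattern-matching idea.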

55. A nucleic acid array comprising 4 or more nucleic acids immobilized on a solid
support, each nucleic acid comprising a sequence of 10 or more nucleotides, the
sequence consisting of at least a portion of a sequence selected from the group
consisting of the sequences listed in Figs. 1, 2A-2R, and 3A-3E, or a complement
thereof, Figs. 6A-6E, 7A-7U, 8A-8I, 9, 13A-13C, and 14A-14G, or a complement
thereof.
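
The array claims refer to sequences of 10 or more nucleotides that are a portion of a listed sequence "or a complement thereof". The Python sketch below shows one way such portions and complements could be enumerated. It assumes that the complement is written as the reverse complement in the 5' to 3' direction, and the function names are hypothetical. The example input is the first 20 nucleotides of clone c0148 (SEQ ID NO:1).

```python
# Translation table for Watson-Crick base pairing (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_probes(seq, length=10):
    """Yield every contiguous window of `length` nucleotides from `seq`."""
    if length < 1:
        raise ValueError("length must be positive")
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


fragment = "CCCCAACCTGCTCCATTGCT"  # first 20 nt of clone c0148

print(reverse_complement(fragment))
for probe in candidate_probes(fragment, length=10):
    print(probe, reverse_complement(probe))
```

Probe selection for a real array would also consider specificity and melting temperature, which are outside the scope of this sketch.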

Novel Sequences from Clones in the Muscle-Adipocyte Union
Library 

Line number 259 
>c0148 

CCCCAACCTGCTCCATTGCTTGGGGGAGCGGTCCATGAGCGCTTGTCTCATCCCT 
GGCCTCCCGGGAAAGTCTATGCAAAAGCTAAGGTTAACA (SEQ ID NO:1)

Line number 258 
>c0827 

TCACAGAGGCTCTGAGGCTACCACGAAGATGAACTCTCAGAAATGGGATTGTCA 
CCCTCGATGAGTTTCCAGTTCCCTCTCTGTTGTATGATGACACAAGAAGGTGAAG 
TGTTGCCTCTCTACAACTGGAAGAGGGAGA (SEQ ID NO:2)

Line number 260 
>c1083

GTACTAGCGCTTACAGGTCTGTGTGCAGCCATGCCCAGCTTTCTAAGTGGGTGCT 

GGAATCTGACTTCAGGTTTCATGCTTGAGCAGCAAAGCCCTCTTACACAGAGCCA 

TTTCGACAGTTCTGTGACTTAGGTAGACTCACATCTGTCAGGCTAGAATTTCCAA 

AATTGAAAATGAATTCAAAGTGAAATGCTTGGGAAGTAAGTTAAAGATAGGCTA 

AATGGTTAACCCAGCAGTTAGGGTTGCTTTCTGCTCTTCCAGGGGACGTAAGTTT 

GGTATTTTCTAGCACCCACACTTGGTGGCTCAAGCCCTCTAACTCCAGCTGCAGG 

GGATAGGATGCCCTCTTCTAACCTCCACTGGTGAGAATATCTCACACACACACAC 

ACACACACACATGCTCACATACACAACCT (SEQ ID NO: 3) 
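
Each entry in these figures follows the same layout: a "Line number" line, a clone identifier introduced by ">", and sequence lines that end with a "(SEQ ID NO:n)" tag. A minimal Python sketch for reading such entries is given below. The function name and the returned structure are assumptions for illustration, and the sketch assumes the tag is cleanly formatted; tags damaged in scanning would need to be corrected first.

```python
import re

# Matches the closing tag of an entry, e.g. "(SEQ ID NO:1)" or "(SEQ ID NO: 3)".
RECORD_END = re.compile(r"\(\s*SEQ\s*ID\s*NO\s*:\s*(\d+)\s*\)", re.IGNORECASE)


def parse_listing(text):
    """Return {clone_id: (seq_id_number, sequence)} for each complete entry."""
    records = {}
    clone_id, parts = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.lower().startswith("line number"):
            continue  # skip blank lines and the "Line number" captions
        if line.startswith(">"):
            clone_id, parts = line[1:], []
            continue
        if clone_id is None:
            continue  # text outside any entry
        parts.append(line)
        joined = " ".join(parts)  # lets a tag split across two lines still match
        match = RECORD_END.search(joined)
        if match:
            sequence = re.sub(r"[^ACGTacgt]", "", joined[:match.start()])
            records[clone_id] = (int(match.group(1)), sequence)
            clone_id, parts = None, []
    return records


example = """
Line number 259
>c0148

CCCCAACCTGCTCCATTGCTTGGGGGAGCGGTCCATGAGCGCTTGTCTCATCCCT
GGCCTCCCGGGAAAGTCTATGCAAAAGCTAAGGTTAACA (SEQ ID NO:1)
"""

for clone, (seq_id, sequence) in parse_listing(example).items():
    print(clone, seq_id, len(sequence))
```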

Muscle Adipocyte Union Library

Number: 172 
>C0025 

GTACACTGACTTGCCCCGTGGGACCATCTCTGAAAACCAGGCTGTGGGGGCT 
GGAGACCCTGCCTCGGATCTGAGAGACCAGGCTTCCACATAAAAACGAGGTG 
CTAATCTGGCCTTATTAAGTCTGGGCCCAAGTGTTTTCTCTCTCAATAAAATG 
ACTTAAGGTAAAAAAAAAACCTC (SEQ ID NO:4) 

Line number 229 
>C0039 

GTACTATGAGCTGAAGATTGCCTTCGTCATTTGGCTGCTGTCGCCCTACACTA 

GAGGGGCGAGTTTAATCTATAGAAAGTTCCTTCATCCCCTGCTGTCATCAAAG 

GAAAGGGAAATTGATGATTATATTGTCCAAGGCAAAGAAAGAGGCTATGAG 

ACAATGGTGAATTTTGGACGGCAAGGTTTGAATTTAGCAGCTGCAGCCGCCG 

TCACTGCAGCAGTGAAGAGCCAAGGAGCAATAACGGAGCGTCTGCGAAGTTT 

CAGCATGCATGATCTGACAGCTATCCCAGGGGATGAGCCGTGGGGACACAGA 

CCTACCAGACTTTG (SEQ ID NO:5)

Line number 240 
>C0076 

GTACAGTCCATGCTCATCTGAGAAATTTACAGACTACAGTTGGCCAAGTTCCT 

CACCATATGGTCATTATACCTTCCATAAGATTTTGATTCATGCTTACTTTTCTG 

TATCCATTTCTGGCAAACAAGTTACTTGTATCATGACACAGGAGGATTTTAGT 

TAGCTCTCTTACACACTATTTTATTGATGCATTATGAGATTTAATGACTATGA 

AGGGGAATGATAATTCTAGTTGGCCATCATTGGCAGCACTTACTACTAAAGT 

GGAAGTGAGACATTTGGACTGTATATCTGTTTGGTATGTTATATAACTACTAA 

AATTGAATGGTGGCAGAAAGGAATGAA (SEQ ID NO: 6) 

Line number 191 
>c0089 

GTACCCACGGGATGAAGAACCTTTCATTTCCCCCCCTTTTCCTTTTCTTCCTTG 

TCGCTGAACTGCTGGGCTCCAGCATGCCACTGTGTCCCATCGATGAAGCCATC 

GACAAGAAGATCAAACAAGACTTCAACTCCCTGTTTCCAAATGCAATAAAGA 

ACATTGGCTTAAATTGCTGGACAGTCTCCTCCAGAGGGAAGTTGGCCTCCTGC 

CCAGAAGGCACAGCAGTCTTGAGCTGCTCCTGTGGCTCTGCCTGTGGCTCGTG 

GGAGATTCGTGAAGAAAAAGTGTGTCACTGCCAGTGTGCAAGGATAGACTGG 

ACAGCAGCCCGCTGCTGTAAGCTGCAGGTCGCTTCCTGATGTCGGGGAAGTG 

AGCGTGGTTTCCAGCACAGCCACCCGTTCCTGTAGCTCCAGAGATGTTTGATG 

TCCTCCGGTCTCTACAGGCACCTGCACTCACGTGCGCGAATCCACACACAAG 

CACACATAGTTAAAAATAAAACAAAACAGGCTCAAAAAAAAAAAAAAAAAA 

AAAAAAAAAAA (SEQ ID NO:7) 

Line number 254
>c0095 

GTACTTCTGCATGCACTCTTGCATGGCCCGGAAACTGGTCTATACAGTCTGAC 
CCCTTGATATCCTCTGTTGCTGTAGTGGAAGCAGGAGAATGCATACTTGAACT 
GCTCCCCACAGGGGCCGCTGGCCATTCCCCCAAGACATGGACAATTCCAGTT 
TAATATCTCCGTTAGGCAGTATCAACCCGTTGCTCCTCATAGGG (SEQ ID 
NO: 8) 

Line number 200 
>c0103 

GAGGTACAACATTGTCGGCTTGCGCAGCAACGTGGACTTCCTGCTCCGGCTCT 

CCGGCCACCCAGAGTTTGAGGCTGGGAACGTGCACACGGACTTCATCCCTCA 

GCACCACAAGGACCTGCTGCCGAGTCACAGCACTATAGCCAAAGAGTCTGTG 

TGCCAGGCAGCTCTGGGGCTCATCCTCAAAGAGAAAGAAATGACCAGCGCTT 

TTAAACTCCACACTCAAGATCAATTCTCTCCGTTTTCATTCAGCAGTGGGAGA 

AGACTGAATATCTCTTACACCAGGAACATGACTCTGAGAAGTGGTAAAAGTG 

ATATAGTCATAGCTGTGACGTATAACCGCGATGGGTCATATGATATGCAGAT 

TGACAACAAGTCC (SEQ ID NO:9) 

Line number 193
>c0121

TGACTCCTCTTACTTGTGAGAAACAGGCACAGAGAAGCCCTGATGCGGTGTC 

ACACATGGATGGCAAGGGGCTGAACTAGGTCTCCTGAGGGGGAAAAAGAAA 

CATCAGGCAAGGCAACCGTCTGCCTTCACCACATGACTCCTCTTGGCAACCC 

AGTGTCTGGGTTGTTGTAGGGAATTACTTTAAGTTATCCAACAAGCCCTAAGC 

AGAGGGTCTGTTTCCTTTAGCCTAGATCCTAGAAGAGGCTGGGCTCCACTTGC 

CTCCTAAAGCAAGTTTGTCCTACTCCTTGTCTTCTGCTTGATTTCTGGAACTGG 

GCTTTGTTTCAAGCATGTATAAAGTTTTGGCCTTACTCCCACCCAAACCCTTTT 

ATTCCACCATTCTACAAATTAAGGGAGG (SEQ ID NO:10)

Line number 195 
>c0124 

GTACTGAGCTTGCCCTGTGCCTTGTGGGGAAGGGAGTGAGAAGGCAGCATAA 

GAGGTCTTGCCAGTTGAGACAGAAGCAGGTAGGTGCAGGAACTATTTTGGAT 

AGGGCATGTTGTTCAGTTGATGGCAACTAATTCTCTTAAGAACATATTGTCTC 

CTATTAATGTTGGGGACAAACCCAGAAAGGGAAGGGTATAGAGATGGGGAA 

TTGGTGAGATCTTCCAGCCCTGAGACCATGTCTGAGCCACCCCAAACTTAGG 

CTATCTCCCAGCATAGGAGTGAGACACTGCCCAGTTTACTCAAATTCTAGTGT 

AC (SEQ ID NO: 11) 

Line number 113
>c0139 

GTACCAAAATATGGACAGAAAAGATAATGAAGCAGACCGAAGTGCTGTTGC 

AGCCAAATCCAAATGCCAGGATAGAAGAATTTGTTTATGAGAAACTGGATAG 

AAAAGCACCAAGTCGTATAAACAACCCGGAACTTTTGGGACAATATATGATT 

GATGCAGGAACTGAGTTTGGCCCAGGGACAGCTTATGGAAATGCCCTTATTA 

AATGTGGAGAAACACAGAAGCGAATTGGAACAGCTGACCGAGAGCTGATTC 

AAACATCAGCCTTAAATTTTCTCACTCCTTTAAGAAACTTTATAGAAGGGGAT 

TACAAAACAATCGCAAAAGAAAGGAAACTATTACAGAATAAGAGACTGGAT 

TTGGATGCTGCAAAA (SEQ ID NO:12)

Line number 244 
>c0152 

GTACATGAACCAGATGTATTTCTCAGCTTTACATAGGGGAAAGGGAATTAAA 

AAAATACGCAATTGCCCAGCAAATGCAAATGTTTAAAAAGGAAATGCAGAG 

AGAACTATGGGAATGGAACAAACAACAGACAGAACTTCAAACAGTGAAAGA 

AAAACAAACAAAACAACCAGAGGGAGAAAAACACAAAAGATCTGAAATCCA 

CCAATCGCTTTTTGAGCTGAATGGGGGTTGCTTTAAGACCAGAAGTCAAAGT 

CACTGCTGCTGGTGGTCTGCCACGTGGGGGTAGTTCACCTAATTCCAAATAGC 

TGGCCCTGCTTCAGGGCTGGGGCACCC (SEQ ID NO:13)

Line number 208 
>c0196 

AGGGGTGTAAACAGTAAACTGCTTTATTGAGACACTGTTACAACGATTCCTTT 
GTTACACAGTTTTAAAATATTTTATAACACTCTTCCTGGGGAGAAGTTAAAAT 
CTGAGGCTTAGTTTAGACTGCTGGGAAATATACAATGTAC (SEQ ID NO: 14) 

Line number 196 
>c0222 

CAAGTTCTTCAGTTACAACCTAGTAGTATACTTACTCTTCCAACTGTCCTAAG 

GTCACTTCCCAGCCAGCTTAGGATCTTCAGCATTTTTAAGAGCTGAAGCTCCC 

TCTTGCCCTTCTTGTCTACTCCTCACTGCCAGTTGGGGCCTAGGCTTAGTCCTG 

GGCAAATGTCCATGATCTTGCTGCTGTAGGAAGCTTGATAGGGCATTTGGCTC 

AAATTTCAGAAGGCCTCGCTCCTGACCTAATTTCTCAAAGCTCCGGTAGTTCT 

AGAACCCTCCAATTTCTCATCTGGTTGCAAGGCTTATTTTTCTTTT (SEQ ID 

NO:15) 

Line number 243
>C0236 

GTACATTCAACTTCTGCCCTGTAATTCGGCCAAGTAAGGCCCATATCCCTTGC 

CCTTCACTTCGAAGTTTCCCATTCAGATTTTGCAGTTCCTCTAATGATTCACAG 

AGCTTATTATATTCTACATGAAGCTTTTCTTGCTCATTCTCTAATTTATCCTTT 

ATATTCTTTTGTTCAGCCATATTTTCCTGAATTTGAATCATGCGCATATTTCTC 

TCAAAATAGTAATTCACAAAATCAGCAGTGAGAGCGGGCTGCTTATGCTTCC 

GCCTTCCATCCACACTAGAACTAGTATCTGAATTTGTCCACTGGAAAGATATC 

CGT (SEQ ID NO:16)

Line number 203 
>c0250 

GTACAAATGTCCAATGCGTATTTTGTGTTTGTCTTAAAAATTCATCAACTCAA 

CCTGAATTTAGGAATATACTTTAATAAAGCATTTGTGCGACACAGAAGCGAC 

ATGTCTGATAAACTCATAGGAGGATTGTCCCTTGAGGGGAGGGACCATCTTG 

AGAATTCACAGTAATAACAGCTATACTGACTGTTAAAACAAAAATTACTGTT 

CAATACATGTCTTATCTCCCTACCCTACCCCCGGAAAACAGGAATGGGG 

(SEQ ID NO:17)

Line number 167 
>c0352 

GTACAAGGACCAGCTCTTGAAAGAGACAGTGCTCCAGCCACTGCTGCAGCCA 

CAGATCATGTCAGCATGAGTAGTCGTGCTGAAGGGAAAACACAGAATGCTAT 

CTTAATGACCATGCCAACATTATTGAATAGCCGAAAGTCCCTAAACCCACTCT 

CTGCTGCCTTATCAATGCTAAACCTTATTTGTCTTCATCAAGAGTAGTTCAAA 

ATATGCAACTAATTTAATAATTTTGAATGATGGTTTTATCTATAGCAATCTGT 

AGTAATATGTATATTATCTATTGGGATTTGTGTAATAAAAAATCTAAGGGAAC 

AAAATTTTATAACTACAAGCACTTAAGTAC (SEQ ID NO:18)

Line number 251 
>c0367 

GTACATATTACCCCGTTTCCTCCTCAATAGAATTGCCACAAACTGCATGCTAA 

ATTTAGTTCTTCTGGACAGACCACAACCCTAAGGCTAGTTCTGCTATGTCATA 

TATGAGTATTAAATATGGTATGCTTAGTATACTCCAGCCTAAGATAGTTAACC 

ACCTGAGACCAGCTGTGATGTTCGAAGACATACAGGATGAGGTTTTCTTTCAC 

AGGGTTCTGAGCATAGTTTCTGTCCCAGGAATATTGTCTTATCTCCATAACTA 

TAGCTGATGCAGAAAGTCCAGACAATATACTCATTTCGACTCAGAATATTTCA 

AATTTAGCAATAAAGAGTTAGCTTTAGTT (SEQ ID NO:19) 



WO 02/33046 PCT/US01/4945J 

6/77 

Figure 2E 

Line number 197 
>c0380 

GTACATCTGAATTACACTGTAGGCTCACAAGCAAAAACGTGGTCCCATGGTG 

AACAGATGACATCTCAGAGTTCTAAATCTCCACTTTATTTTTAAGATTAGAAG 

TTCTCAAAGCTGAGGCTGTAGCTTGGTGGATAGAGAGAGCACTTGCCTAGCA 

TTGACAAGACCTCAAGTTCCATCCTCAGCACTTGGGGTGGGGTTATTATTTAT 

ATCACTACCATGAAATTATATTGTTTTCATAAAGAAAAATACTCCTAAAATAG 

ACTGGGTAAGGCATTGTGGCACACTGTGTTCCTAAGCCACCCAGGAATCTGTT 

GAAGG (SEQ ID NO:20)



Line number 236 
>C0439 

ACCTTGTGCGTTTTTGACCCACAGCCTAGAGAAGGCAGCCACTGCTGGTTACA 
GATGGCTTAGTGCAACACAGTGGAGGTAGCAAACATGTAAGGTCCCTAGGGG 
CTGTGTGACAGCCTGGAAAGTAAGACAGTGGGGACAAACGCAGGGAGGGGC 
CAGGCAGCCATTCTTTCTCACCGCCTCCTTTAGAATGTTCCCATAAATTACTA 
TTAAAAAAAAAAAAAGTATCCATAGAATACTGAGAGGA (SEQ ID NO:21)

Line number 199 
>c0442 

GTACTCACCTAATCACTTGTTATCCAGTGCCTGTTCTAGGTTTATGGACTTAA 

CTATTTCTGTGATGTTTCATTTTTAGCCATGTTAACTCCTAACACATATTCTCT 

TATGTCTCAGTAAAGTTTCATTTGATAAGTTGTTGAGATTCTGTTATTTGATAA 

TATTCTTCGGCTGTCCATCCAGCATCTTAATCACTTTAAAACTGTGATTGTTAT 

TTGCAACTCTGTTCTTTGGAAAGAATAAAAGCATTTTTTTTCACTTGCTAACA 

TGCTCACAAATGTGAGAGAAGAGTCATTAAAAGCTTTACTTACTGGGAAAAA 
AAAAAAAA (SEQ ID NO:22)

Line number 198 
>c0443 

GTCCTTGCCCATTACCAAGAAGTATATGCCAGAGAATAAGGGCGTTCCTCTG 
CAAGGGAGCCAGGAGGACAAACCCTTCCCAGACTTTGACCCCTGGTCATCAT 
ATAATTGTGAGCAGAACGAGTCATAGCCCATCCCTGCCAACTGCACTGGCTG 
TGCCCAGATATTACCCCTCAAGGTAACGCTGCCAGA (SEQ ID NO: 23) 

Line number 210 
>C0504 

GTACATGTGCACTAGGGACACACTCCATAAAGGCAGCCATGCTGGGAAAACA 
CTTAGCTCCATAATGGTGCCACTCAACTCCAGCTCGCTCTTTTTCGTTCTGATT 
GACCGTGATGAGTTGAAGGGTGAGAAACACGGCACCTAGGTGCTGCATGTTT 
CTGACATTCTCATAGTCTGTAGAAAATAAAAGTTTATCACACCATCTAAAAA 
AAAAAAAAAAA (SEQ ID NO:24)

Figure 2F

Line number 250 
>C0507 

GTACAGAGAAGCGTGTGCGACCAACCTGCAGTTACAGCTTTCTTCACATTTCC 

CCATTCCACACAGCTGTTCCAATGAGCGATCTTGCCCTGGTCACACCACTTTT 

AAGCACACTCACAAAACAACCACATATACAGGGTTTTCACAAATATGCAAAT 

AAACAGGGTTTTCAGAACCAATTTTTACAAAGTGTTATCTAGTAAGATGACCC 

AGGGCCCAGTCTCTTTTATTTCCCAAACTGCTTTGACTGACAGTTTAACAGCT 

TCCCCTCTGCTCAGGGATGGTTTTTAAGCACCATTCAAGCCAAACCCACAATG 

GATGTATTCAGTGGCAAAGCACTCATGAGGTTTCTACAAGCATCAGAATGAT 

CTCATAGCAAGCGCAAGACCCTGTCCCCTTGTCACAATGGCTGATTTGAAAG 

GTTAAATAAAGCAAACGCTGTTGTCTTCTGGCCAATTCCAGAAATTCAAGATG 

TGTTTTGAAGGCATGGCTCACACATTAAATTCCTGTATGTGTTGGATAAGGCA 

TGACTTCCATATGAAATTCCTTTTAGATAGCAGCATTTGGAATTCTT (SEQ ID 

NO:25) 

Line number 233 
>C0513 

GGCTTGTGAGGGTGACTGGACTGTCCAGTGTATGTCTGCTGCTTGTTCCACTG 
ACTCTACAGAAACAAAACCCGGTGGGTGCTAGACAGTGCTGGCTCTGCTTGC 
AAGCCATGTTGCTGTGGTTGTTACTTATGTGTCTGGTCCAAATAAAGGCAGCT 
GCTGATTTGTTGTTAAAAAAAAA (SEQ ID NO:26) 

Line number 234 
>C0533 

GTACAGTTTCATTTAGAAACTTTGTGACCTTGCCCTATTGTTGGGTTTTCTTAT 

TTGTCCCTCCAGCTACTTTGTAACAGCCAGAGGTGACCCTTGAAGGAATCCTG 

AGAAAGGAGCAAACAGGAATGGAGCCTAGCCACGGTCCCTCAGTTCGGCCTC 

AGGGGCGTAGTCCTTCATTGGCTGCATTTTTCTTTGTGCTGGATCACACCCTTC 

TGGATCAGATCGGGGACTTCCACTGCCAGCCAAGGACCCAGCCCCAGGGCCA 

TGAGATGAGCTAGTCCAAACTTAGGCACATTCCTGGCCTACAAAGGTTTGAA 

ATGATCAGTCAGACATATTTTGCCACCCCTGTAC (SEQ ID NO:27) 

Line number 231
>C0536 

GTACTGCTTCCTGCCAAGAGTGAACGGATCTGTTCTCACTGTGACCCTCAATT 

CTCTTCCAGAATGTTCAGAACAAGAGTAAAGTGCGTCACACAGAAAGGAAGT 

GCAGTGTGAGTGTTCATGAGTTTGGATCAACTGAGGTCTTTCTTACAAAAAAA 

TGATGAAATGTAGCAACCTTTACATTATATTCAATTGCCAATCATGACAGTAT 

TTCAGCTCTACCTTCATGATAAACTTTTGTTACTAAAATTTATATACACACAT 

AAGTAAAGTTTTAAGTATTTTATGCTGTTTAATGTTGTGTGGCTTTTATACAAT 

CTACAGAATAAAGAAAACTAAC (SEQ ID NO:28) 

Line number 230
>C0538 

GTACAATATAACAACATCCATATTAGAATGAAAATCGGAGTATTTAAAAGAC 

AACATTAAATCTCTCTTTTTTTTTTTACAGCTTGTATTTACAAAATGAATCAAA 

ATATTTTTTTTCTTTAAATTCAGGACTCAATAAAATAAACAGCTGGGAATATG 

CAATCTACGACTGTGTAGAGAGGACTGAAAAGAAAACCCAATGTCAGCGAA 

AAGTGGGGTGAGACTCTAGTAAAGATCATTAGCAAAGTTTTAGGTCCATACT 

ATGGTTTTTAGGAATCCCATAAATGGTTTGCATTTCTAGTGTGACAGAAAAAG 

AAGCTTTACATCCTGT (SEQ ID NO:29) 



Line number 238 
>c0564 

GTACAAAGGATCCAGAAAAGGGGACATTTTAAAAAAATCACTGTTTAGGCCA 
CCAAACCAGGGGAATCGTGGTGTTACCATCCCATTAGAGTCCCAGTTTACAG 
GAAGACACACGAGACAGAGAGCAAAGCCAGTCTCTGTGCTTAGCTACTGTGG 
GGGACTGAACACATGTAAAGAAGTCCTACATGTCTGCTGGTCCTTTCCCACCA 
GCCCCCATCTTCAGGTCTCCTGAGCTGAACTTCAAT (SEQ ID NO:30) 

Line number 253 
>c0572 

GTACCCAGGTCACTGGCTTCATGTTCAACAACTTCTACCTACAGTAAAAGTTA 

AAATTAACACCAGGTATCAGCATATCTTTTATAATACGTGCAAATCACATAGT 

TTTCCCCATTCTCCCTGCTTTGATACATACACAAGCTGCTACAAAAATAATGC 

GAAAATGTTGAGACCAATCTACTATACTTCAGAAATCTCATCAACCCAATAT 

ACAAAGTCAAGATAAAAAAGCAGGCTGTTAAGTTATTCCTGGTTCTGACCAT 

ATTATTTCTACAAATCTCATGTTCTCAAATTAGAGCTATAAGTTCCAGCCAAT 

GTCTCTGTTGTGGGAAAAGCATAGTCCAGCACTGACGTGAAGGGAGGCAAAC 

CTGTCCCTTCAATGTCAAGCTCAACGAGGGTGGCGG (SEQ ID NO:31)

Line number 249 
>c0580 

GTACCCTGAACCTTGAATCTCAGAAGGAAGTTCAGGCGGTGGGTGCCTCCAC 

CCCCATCTAGCCTTCCTCACTGGTTCCCGGGGCTGTCAGCGCGGCTCTCAGTC 

TTTTTGCACTAGGGCGGGCAGAGGTGGGATGCGCTCCTTAGATGCCTCTCCAG 

TATTCAGTGTCCACACTTCCTGGGATAAATCCCACCTCTGGGTTTTCCTTTTTG 

GGGAGAGGGGCAGGGCAGGAATTTGAAGCTGTCACTTCCTTGTGAAGTGTGA 

ACACAGTTTAGCCATAGGCAGTTGTGATGTCCAGTGTTTGGGATCTATGGGCA 

AGTAGTTTTCCCAAAGCAGGCATTCAGTTCTGGAGCAGGCTTGACTCTGGCCA 

AACCCTGAGCCACTGTGAATTTGCTGGTGGTCTTCCGACTTACTGGAGTGATT 

TTTTAGGTTTTCCAAATCTTATATGACTCTTTTTTTATCTCTAGCTGATAGAAC 

CGCATGTAATTAGGTGTAC (SEQ ID NO:32) 

Line number 255
>c0582 

GTACAAGAGTTAACAAATTACAATTAAAACGGAATTTGTTATGGCAATTCCA 

CAGAACTTAAAAACATGCAACACTGCATGGTAAAAACAGCTTCATTCATATA 

CAAAAAAATATTCCTTTCCGACTCATGACAGTGCCTGGAAACTTCTGACATAA 

GCTTTTTGCAAAGAATATTTTTAAAAATGTAACGATCTGATTAACCAATTAGT 

GCAGTATTAGGAAAGATAATAAACATTATTAGTAAAGAGGTTACAGTGATTT 

ATACCAGGTATAGACAGGGCTCAATGTAGTCTCATTAAATAAATGTTCAGTT 

AAGAAAATAGTTTTGAAAAAGATCTTATATTGAAGCCATGTATTAATTTTGTT 

GAATCAGCTTATATAAATCAATAGTCAAGTTTATTCAGTTAAAGAAAATAGG 

ACTATGCTTTCTTATACTCATAAATAGTAC (SEQ ID NO:33) 

Line number 194 
>c0591 

GTACCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCGGCT 

TCGAGCGGCCGCCCGGGCAGGTGTGAGAGGGAACAACTGCCAGGGAGCTGT 

TCCAGGGACCACACAGAAAAAGGCCTCGCTAAAGCAACAAACCTGATCATTT 

TCAAGAACCATAGGACTGAGGTGAAGCCATGAAGTTCTTGCTGATCTCCCTA 

GCCCTATGGCTGGGCACAGTGGGCACACGTGGGACAGAGCCCGAACTCAGC 

GAGACCCAGCGCAGGAGCCTACAGGTGGCTCTGGAGGAGTTCCACAAACAC 

CCACCTGTGCAGTTGGCCTTCCAAGAGATCGGTGTGGACAGAGCTGAAGAAG 

TGCTCTTCTCAGCTGGCACCTTTGTGAGGGTGGAATTTAAGCTCCAGCAGACC 

AACTGCCCCAAGAAGGACTGGAAAAAGCCCGAGTGCACAATCAAACCAAAC 

GGGAGAAGGCGGAAATGCCTGGCCTGCATTAAAATG (SEQ ID NO:34) 

Line number 205 
>C0596 

GTACATAGTGCAACTTCATCACAATGTCGAGGTTTAACACACATTCAGTGCGT 

GTTTTCTGTGCAAGTTCTTAGTGGTCTGGGACCTAGTTGAAGGTATTCATTTG 

CAGATCCACGAGATTCAGGAGGTTTGAAGAAAAGAAAAGTTTCCTTGATTTC 

TTAAACCAGCACTGGGTGTGCAAAAGGCAGCGGGCAATCCGGCTGTCAACTG 

GAGCCAGATCAGGGCGACGTTAAAAGTTAAGATCCAGCCGCGCATCTCGGCC 

TCTGGGCTGAGAGCGAGCTAAGTGAGATCCAGGCAGTTCGTTACTTAGACTG 

G (SEQ ID NO:35)

Line number 209
>C0600 

GTACCAAAAGTCTTTATTTTTGGATGTTGCCTGAGATGACCTCTTTCCCCACTC 

TCTGATTCTCATTCTTCCATGACACTCAGTTCCTCTTGCATACTCCCAAAGATG 

GCAAGGTCTAACCATACATGGATCCTCTGGCTTTGGTTTCTGCGTTAAGTAAT 

ACTGGTCACTCAGCCTCCATCTTGGAGGAGGATGTTTTCCTATACCACTCCTA 

AAGGTAGTGACCGCACACCAAATCTTAGGTCTCCACTGTCTCAAGGGAATAT 

AAAGACCCCAGGGTGGGCTGCAGGTTGCTTTAAACAAGGGGACTACAAGGA 

CACCCACAGGTTTCTGCTCTACCCTCTGGCTTGGACAGCAACAGAGAGGCTTC 

ATCTAAACTTAAGAAATTCTTTTCTACAAGGCAAGCTGTATCTGCTACCACCA 

AC (SEQ ID NO:36) 

Line number 245 
>c0635 

GTACAAGAGATGCACACTTGGAAACCTTGTAGTTGTGAGTAACATATTTATA 
CCTATCTTAGTGGGCTTCAGAGAAACAACACGTTATGTATAATAAAAATGGA 
GAATTAATAGTTTTTCCAGGTATAAATGTAC (SEQ ID NO:37) 



Line number 213 
>C0681 

GTACAAAGGTAACAACTTAGAATCAAGGCCCGATGTAGGGACTATCATATCT 

GCCCTATTTGCCAAAGGGAAACATGATAAGCTCCCACCGGACACGGACAAGT 

TACTGAAGGGGAGAGCAGGTAGCTTTCCACGTTAAAAGCATCTAAAAATAGC 

CCTTATGTAATGGAGCCAGGGGAGGGGTGCACCAGATTAAATCTGGAAGCTA 

GAACTGGTAGGCTGGAGCTGTCCTTCTGGAAAATTTGGAAATTTTATAACTAG 

CAATCAGACAACTTTATTCTTAAGCTTTATTTAAGTGTGGTAGGAACAGCATA 

CGCTATAGTGTAC (SEQ ID NO:38)

Line number 212 
>C0684 

GTACATTCCATATTGGACAAAACTTATACTTTGAGTTTCATATGAAATTGAAA 

ATTGTATTTTTTTCACTGTTTCATTTTTACACATCTCTATAAGTATTATTTACA 

ACCTTGAATAAATAAGCAATAGATAATGGCGAGTAAATCAAGTCACAGTAAA 

TACAGCTGTTTTTAAATATTACCTTTAGCTACTATTGTATATAATTTAGAATTC 

ATTTTTTCACCGACCAAGTGTTTTACTATTTCCAATCAGTGTTGCCTGCAAAG 

ACTTTTGTTTATGTGATGTAC (SEQ ID NO:39) 

Line number 211
>C0689 

GTACTTTGGGTTTGGGGACCTTGAGCACTGACGTGAAACTCTAAGGAACCTG 

AGCCAAAGAAAACTATTCTCTCAGTGCTGGAACTCAGCGAGCACATGTGTAT 

TCTGTGCAAGATGCGCATAAGTCGGGAGAAAGCTTTTACAGAGCAACGTTTC 

AATCCTGACCAGGGCTCCAACAGACAAGATGTATGTCACGGCGAACAAGCCA 

ATAAGCAGCACCCGTCATGGTTTCTGCTTCGGTTCCTGCCTTGAGTTCCTTTG 

AGCGTCTCTCAATGGAGTGTGGCCAGGGATATGTAAGCCAATAAACCATTTC 

CTCCCCCAAGGGGTTTCAAGTCTTTATCTTAGCAATAGAAACCT (SEQ ID 

NO:40) 

Line number 241 
>c0692 

GTACAGTGTGATAGGACCACTGTCCTGGCAGAGCAGGACTAAAGGAATGTAG 

ACCCCATCTCAGTGTCTTCAGAAGGACTGTGAACAGAGGCAGCGTCTGCCAT 

GAAACCAGTTCACCGCTCCTCTCCATGAATTGACCTTTTCAATTTTTTTAAAGT 

TATTTTTGAGACAGAGCCTCATGTAGCCCAGGCTGGCCTTGAGCTCTGTATGT 

ATCTGAGGCTAGTCCTGAATTCCTAACTCTCCTGAGTGCTGAGATTGTCAAGC 

ACTGCATTAAACCTGGCTGGTCTCTTGTCTTGACTTCCTATATCTTTGGCTCCT 

TCTGTCTCAGAATGATGGGAAGCCTGACGGGGAC (SEQ ID NO:41)

Line number 257 
>c0710 

GTACTTCCCATGTCGGTTAAAGTGTTTGTAGAGATAGCTGATCCGTTTGTTAA 

GGTTATGCAGGTCCTCAATGAACATCCTGTTTCCCACAAATTTCTTCTTTCGG 

ACACAACGCTGCAATTCATTTCCTCTCCAGCCTCGAAGCCATACTGGGCCCTT 

GATCAGTTCTTTTGGGTGCTTCTCGAAATTCCCCAGGATTCCAATGTTGTCGT 

AGACGCCAAACAGAGCCCTCTTCTCGTTCCAGCGATCAACCACTTTGGGAGG 

TGGAAGAGTGAGCCTTACGAAGGCCAACCGTGGGACGCCCAGAGAGAAGCT 

TCTGCATGGCCGAGGGCACCCAGCCCCACAGTCTTCTACTTGCCCGCCAAGG 

GACACTTCCTGCCATTACTCTAGCCTGTCCGCTCAGTAGGGGCAGCAAACCA 

GCCCGGAAAAATGGGGAGCGGGGTGTGAATTTCAAAGCCCAGACACAGATT 

ACAGTCCCCAGGCGGGGACTTACACGGTAGTGACAGTAAAAAGGAGCCATTT 

T (SEQ ID NO:42)

Line number 232 
>C0724 

GTACGTGGAGCATAAGCAAGGAACTCGGGGCACTGTGAAGCGGGCAGCTCCT 
CGCCAGCTGGCCCCAGTCAGATACACTATGGTGATCAGTTCTCTTTCTGCGTA 
GATGACAAGAGAAAGATGAGCCATGCTGACAGAGGATGGGGCGGGCGGGTC 
CAGCCACCCGTGAGACAAAAAGGCTGACCTCCATGTTGCAGGATTTGCT 
(SEQ ID NO:43)

Line number 192
>c0729 

GTACCCCAATTCCTCTGCATCCTGTGGTGATGGGGGTGACTGTGAAGCCCTGC 

TGGTCAGGCTTCCTCCAGTGACAGGGGCGGAGTCTGACGAGACTGGCCTAGC 

TCTCTACAGAGGGAGCCCCGGTGCTGTCATGCAGGGACATGACTCTTAACAT 

GACTCTTACAGGCTTCACAAGACAGTCTTAACTTCTGATCATCCCACCCCTGG 
GTTTTAGTGTTAG (SEQ ID NO:44) 

Line number 252 
>C0739 

GTACTTGCTGCCTTCCTCGCGGAGCCGCGTCCTGATGACCTCGTGTGGGTAAG 

CGATGCAGGAGGCACATCCCTTAGAAACAGCAGCAGCTGCCATGAGTCCAAA 

GAAGCCAGAGGAACTTTTCTCAGCTCCATCTGTGGAGGAGACGATCGGAGCG 

TCTTTCAAACACTTCTTTAAGCTCTCATAAATAGCAAAGCAGATGATTGTCTC 

CGAGATCCCAGCGTAGGAGGCGGTCAGCCCTCTATAGAAGCCGCGGACGCCT 

TCTGTCTGGTAGACACGTCGAGCACACTGGAGTGTGTTCATCTGCTTGCAGCC 

CCTCACCTTGCGTTCTAGCTGCATCCTCGTTTTAACCATCCAAATAGGATTCA 

TTAAGGTATTTGTGACAAAAGCTGCAGAGCCAGCTGAGAGAATGTGCACAGT 

ATTGCTATTAGGCACGAAGATGCCATTGAATTGCTCTTTGGCTTTTGGAATAA 

CATGCAAAGTAC (SEQ ID NO:45)

Line number 214
>c0748 

GTACTGTTGCACTTCATCCAGAAGTCCCCGGGTGACTATGTTCCCCAGGAGAA 

CATCCTGCAGGCCTTGGGTATCTCTGCAGAGGTTTGCTCCAGCAAGCCACCCC 

AGTGTGATGAAGAGGTGTGTGGGAGGTGACACGCACAGCCTGAGGACCCGG 

AGGCATCTGTAACCTTGGATGTGTTTGGAACACAAAGAATCCCTGTGGAGCC 

ACTGAGCCAGGACGTTGGACCTTGTTCATCTTTCAACAGCAATGAGCCATTAA 

AGTGCAGGTCCTGGGCAAA (SEQ ID NO:46) 



Line number 204 
>c0762 

GTACCCAGGCCCCTTTGCAAGATTTAATCTTGATTATCTCCTTTTATGTTAGAG 
GAGATATAGTATTTAGAACACTCTAAGATATGTTTTATGTTCCTGAAGCTATC 
TTGCTCATGTTTAAGATACTAATTGTTTAAATGTAAGTTTTCAATAAACGTAT 
TTCTTTAGAGCTGCC (SEQ ID NO:47) 

Line number 185
>C0784 

GTCTGAAATATGTCTGAAAGAAGCATTGGAATTGTAAAATGCCCTATGAATT 

TGCCAATCCTTAGGTGGAACCAGATAGCCTTTGATGCATATAAAGGACAGCG 

GATCCGCTGCACTCCAGGTGCTGCTCAAGCCTCCTAATCAGGTCTAATTGAAC 

TGCTGTGGACAGAGTGTGACGTCTGCTCAGGAGGAGAGGGTTTCACTGACCT 

GGCTTTTAACAGTTCAAAGCAACAGGAGCAAAACACCCAAGTTGCGCATGAA 

TCTGGCATGGTGTCTGATAAATCCTTGTGTGTGCCCCAAATGTGAAGGTTATG 

GGCTATTCATTGTTTAGCCAGCTCTTGACATTCACCTTGGGAGTATGTTTGGT 

AATCACTTGGAGACGCAGGGGAATAGTCAGTAATGTAGCATTTCTGTTGTAA 

TTTCAATTTAACTATCTAGTAGTAC (SEQ ID NO:48) 

Line number 227 
>C0785 

CCCCACTGAAGGCTGAGTGAAGGGCAAAAGGTGAAGGCCTCTTGAACAGCC 

CCAGCTGATGCCCATTTTGCAGCAGTTATTTTTTGATTAATTTTGGCAGCCTAT 

GGCCATTGTTATGACGACGACTTGAAGAACCTGACTAAGTATAACCTGACTA 

AGTGTTCCCTCCAAAGGGGCATGTATCTTTCCATGGGGCAGGCCCTGCCCTGT 

CATGCCAGCCAGCGTGCACCTTGACACTGGTGCCGTGGGCAACACAGACATT 

CAAGCCTACTCAATATGACAATTCCTCAGAAATGAGGACACTTAGCTGACAC 

ACAAAGCACCTTTTTATACAGATCAATGTTTTCTCATTAAACTGACTCTATTC 

(SEQ ID NO:49) 

Line number 207 
>c0793 

CTAGTTTATAAATAGAAGTTTCCAGCATCCTATTTGGAAGATTTTCTTTTCTAA 

TACTTCCAGGCATTGGATGATTTTAAAGATCTGTTGGTAGTGACAGTAAGGTT 

AGATGCTGTTCAAGACAACCAAGTGACTTGAGCAGAGACAACCAGGAAACG 

GGTATTCACTAAATACTCTCTTCCTCACCTCAGGTTTGAAGGGAGGATCTGCT 

TCAGGTGTAGTCTGGAAGGTTCCTTGGTATGGCTCATGAACACAGAGCTCAA 

AGATACCGTCTTGAGAAATCCTCCTTGGTAC (SEQ ID NO:50) 

Line number 246 
>c0794 

GTACCTTTGGAAGAGCAGTCAGAGCTCTTACCTGCTGAGCCATCTTGCCAGCC 
CAGTTCTTTTTCTTTTTAAGCATCTTTAATCCCAGCTTCACTTTCTCTGTCAATC 
ATCCTTAGGCCTTATACAGATCTTAGTGGAATCCACTCTATCAACCACATGGA 
GAGAGAAAGCTCTAAGAGCAGATGTAATTGGCTATGTTCAGAGCAGTCTGCT 
CATGGCAGCTGTAGGACACAGAGAATGGCTATATTTTCCTCCTTTGTAC (SEQ 
ID NO:51)

Line number 247
>c0823 

GAGTCCGGGCGATATAGATCCAAGGCTTGCCTAAAACTCTGTTTTGCTGAAA 

ATGATCTTGAACTCCAATCATTCTGTCATAGCTCCCAAGTGCTTGGATGATAG 

CTTGTGTCACCATGCTTGGCTGTGTGGGTTTCTGTGGTCTGGGTGGGTTGTTTG 

GTTGGTTCTTCAATTGGTTGATTTAATGTAGCCCTGGCTAACCTGGAACTCAG 

TATATGACCAGGTTGACCTTGAGCTAGTGGCAATTCTCCTGCCTCTGTCTGTG 

AGTGCTGATGCTTTTGTGAATACCAGCTGCCTGCTAGGCAGAAAGAGAATGT 

TAACTGAGGTTGTTGCTTGGAGTTTTACATTTTTAACTTTTGGTAC (SEQ ID 

NO:52) 

Line number 256 
>c0840 

GTACCCCAGGTGAGCATAACCTTGGGGGCCAACAGTCTGAATGTATCTTTCTC 

CCCATTTAAACCTGAGCTGCCTAATGCACAGTTGGGTAAGGGGTGGGGACCC 

AGGGCTGAATTGTGGATCCCAGGGAAAGTGCTATGCACAGGAGTGATTTGTC 

ACCTCTGGAGCCTGGCTCCTCCTTGGTCTTTGCCATATAAGCACTTATACAAA 

GAAGCTCTGGGGTGTTGGGGCTCCTACTGCCTTGGCTGCCCTCTCTCTCCCCT 

CCCTGCCTGTGTCGCTTCTCTTTGTCTGCTGTCCCTGAAGGAGATGGCCACCC 

TTTGAGCTTAAGGGAGCTGTTTTAGGCAGAGCCTCTGCTTGGGGCTTTTCTGG 

AGCTGCTGATGGATAGTATCTGTCCAGGGACCTTGTTTCAAAGGGAATGGGC 

ATGGGCCAGCAAGGCACCCCCACAGCTATCAAGAACGTTTTCTTGTTTTTAAA 

CCATCACGTCTTCATTTCACATTGGAATAAAGTGAGTTTTTGCAAA (SEQ ID 

NO:53) 

Line number 215 
>C0846 

GTACTTGGGAAGCTGAGGCAAGAAAACCAAGGTCGAGCCTAGCCCAGGCTA 

CATAGTAAGAAGCTGCCCCTCCCCCCACCCCTTCCAAATCAAAGTCTCAGGA 

TGCTCGTTAGAAGAGCGCGCTTTCCAAGCATGCAGAAGGCTATTTTTTCCTCC 

CTCACAGCAAATTATTACATTATTACTGGTTTAATTAATTGTAAAGAGAACTT 

AATGAAACCACAATGGACACCACGGACTTACAGGTGATTCTCGTGTTCTGCA 

CCTGGCTCCCAGTCACATGCCCATGAGTAAGCAATCTAAATCAGCAGTTCAA 

AGTGATACGTGACTATCAGCACGGGCGAAGTCCACTAAATCTCCTAGAACTC 

TTTAAGATGAAGGAACTATGAGAAGGTAfTTTAAAACACCTCAGAGCACTAA 

AAAGATCACATTTTAAAACAACATAATCAAAACTTCACATTTTTCCCTGATAT 

TCTTAACTAAAAATCTTCTTGTTTTTTCATAAATATATTCACAAGAAC (SEQ 

ID NO:54) 

Line number 202
>c0859 

GTACCCGCGCCATGGGCCTCAGCTCCAACTCCGCTGTGCGAGTTGAGTGGAT 
CGCGGCCGTCACCTTTGCTGCTGGCACAGCCGCTCTCGGTTACCTGGCTTACA 
AGAAGTTCTACGCTAAAGAGAATCGCACCAAAGCTATGGTGAATCTTCAGAT 
CCAGAAAGACAACCCGAAGGTGGTGCATGCCTTCGACATGGAGGATCTGGGG 
GATAAGGCCGTGTA (SEQ ID NO:55) 

Line number 217 
>c0884 

GTACCAGTAACAGCTATTTTTTAACCAAAGTTACTCCAGAGAATTAGAGAAA 
TATGAAGTTGTCTGAAATAATTTATTGAAAAGTAAGATTTTTAAAATCCAATC 
TATCCTGGTCTTACATTTAATTAAAAACCAAATGATGAAACACCATTTCAGAC 
TTATATACATCATGATCCAATTCTTAGAAAAATGTAC (SEQ ID NO:56) 

Line number 216 
>C0901 

GTACAGATGAGGTAATTTACAACACAGAAGTTGTTACTCTGTAAACCCTGAC 
CCTCCCTACCCCCCACCCACTCAGATCTTTTAAAACTCTCCACATTTCTGCAA 
TGACTCCTTTTT (SEQ ID NO:57)

Line number 220 
>C0935 

GTACAATATTAACAGTTAGAGAACTGTTATTCAGAAAAGGTGTTAAACATGA 

ACAGCAGACTCGGGCTGACTGTCAGCTGTGGAAACATAAGAGGTAACATCAG 

TTTAAGTGAAGGCGAAGCCTTGGTCCACAGGCAAGACTCACACGAGGACCAC 

AGGACTCAAGAACTGGGAAACTTACGGAGTGTGTGGGCACTGGTCAGTGAGT 

ATGGGGCTTGTTCAAAGTTTATCTCCTGATCTATAAAATTATTATTGACATGA 

TCTAAAACAAGACCAACTGGGAATGATTGTCATCAAAATCTAGAAAATTCTA 

TTAGGAACAAACTATGGAGCCCAAAGTTAATTGAAGAATGGATACTTTCCTA 

GGCAGAGTTTTCAAGTGTATTTTCCAAGCAACATCACACAGACTCATAGGCA 

ATGATGCAATTTTTAAATAGACAAGATTTTTTTCCCCCTCAATACCTCAGAAC 

TTCATAGACATTGTGTTGGAGAATCTGGTCACAAAA (SEQ ID NO:58) 

Line number 239 
>c0943 

GTCAAGACCTAGGACCTAGCATCCCAATTTCAACCTGCACCCCTCATTACATA 

AGACTTGTTTTAAACCACGCCGATTACCCACTAATTGGCTCTAAATGGGTCAT 

GTGCACTTGTGGATTATCTAACAAGTGGAGACACAGAAGAAACCCTGGTGCA 

GGCCAGGCCGGGAGCAGGGACAGTGTTGGCAAGCCAGCTTGTGAGTGTCAG 

ATGCTTGGGCACCACGGATGTGAAAGGTGCGCCTGGTGCAATGTATGTGTTT 

GGTTAAAGAAACTCTCTGAAATTACTGTTATAATAAGTTTTTAAAAGATTTTT 

CTTTCTTTTTAATTTTCACCTTAACTCTTAAATAGGGTAATTTCAATGACCTAG 

ACTCTTAGAAAAATTTGACTTACCCCACAACTGACATGTTTCTTTTAGAGCTT 

TTGTAAACACAAAATCCTAGTGTAACTT (SEQ ID NO:59) 

Line number 201
>c0974 

GTACTAGTTTATAAATAGAAGTTTCCAGCATCCTATTTGGAAGATTTTCTTTTC 
TAATACTTCCAGGCATTGGATGATTTTAAAGATCTGTTGGTAGTGACAGTAAG 
GTTAGATGCTGTTCAAGACAACCAAGTGACTTGAGCAGAGACAACCAGGAAA 
CGGGTATTCACTAAATACTCTCTTCCTCACCTCAGGTTTGAAGGGAGGATCTG 
CTTCAGGTGTAGTCTGGAAGGTTCCTTGGTATGGCTCATGAACACAGAGCTCA 
AAGATACCGTCTTGAGAAATCCTCCTTGGTAC (SEQ ID NO:60) 

Line number 235 
>C0983 

GGAAAACTATTTTATGCAAAAAAAAAAAAAAAAAAACTATTTTCATCAGTTA 

TTGCCTCAGATACGAGAGAAACTATTTCTGGATGCTGTATTGTCAGGTTTTTA 

AGTGGGTATTTTTCTTTGAAAAAGAAGCTAAGTGTCTTCACCCACTAGGGACT 

TGAAGAAAATTTGAAATAACCCCATATACCAATGATTTCTATGTTAAAGAGC 

CTGTTTACACATCAGAAAGCCGATTTTGTATGCATATGAAGGGCCCTTTGGTT 

ATGAGCAGGGAGCCATCCAGATGGTCTTTGGTTACTATTTATTGGGGAACGA 

TCTGACACTATTACTCTTTTATTTAAGAAACAACGTTGATATGATCAAAGTAT 

GACCACTAGAGAGGTTGAGGGGACATGCTTAAAGAGCATAAGCTGACGCCAT 

GAAGGGAACACCATACTAGACTTTTTT (SEQ ID NO:61)

Line number 228 
>c0991 

gtacagagaagcgtgtgcgaccaacctgcagttacagctttcttcacatttcc 

ccattccacacagctgttccaatgagcgatcttgccctggtcacaccactttt 

aagcacactcacaaaacaaccacatatacagggttttcacaaatatgcaaat 

aaacagggttttcagaaccaatttttacaaagtgttatctagtaagatgaccc 

agggcccagtctcttttatttcccaaactgctttgactgacagtttaacagct 

tcccctctgctcagggatgggttttaagcaccattcaagccaaacgcacaat 

ggatggattcagtggcaaagcactcatgagggttctacaagcattacaatga 

tctcatagcaagcgcaagaccctgtccccttgtcacaatgactggtttgaaag 

ggtaaataaaccaaccctggtgtgttttgggccattccagaaaatcaagatg 

tgttttgaa (SEQ ID NO:62)

Line number 219
>C0996 

GTACAGAGTCAAATCACAAAACAAAGGGTAAACGTAACTGCTCAGTCCCTGA 

GCCCCAGCACACACACGCACACATACATGCTTACACTTGCTCACAGATTCAC 

AGGCAGTCAGCTACAGATGTTAAAGTCAAATAAAGATTTTTTCTTTTCCTGTC 

AGGTCCTGGGAAGAGGGGAAAAATGCTCACTGAGAGACACTGGGGCTTCTCA 

ATGCCACTCTCTGCTCTGCCAACAACCGAAAAGACAGAAACACTGGGGAGTT 

AGCTGTTTCCTTGGCACAGAACTGAATAGGCAACAGAAGGTTAAAGATTATG 

TTCACCTTGGGTTTAGTCTCTTTTAACACCACAACAGGGTAGGTAGCAAGCCA 

GAAAAGAAAATGTCAGCTCTATTAATTCCAGGAAAAAGAGAGAGAAGGATG 

TGAAAACTTGATCCCTATTCACCA (SEQ ID NO:63) 



Line number 248 
>c1023

TTGTTTGAGGATTTTATTTCAAGACAGCATCCAGAATACGATTCATTTAAAAG 

AGCAAAATTAAGTATTTTGGCTCATTTTACCCTTCAAAAACTCGTAGCAATGC 

AAATGCAAACCGCAAACCTCCCAAACGTGCCTCTCTGTATAGCTTTTGGAAAT 

GAGAGTTCATTGAAAATATTTTTCCATTTTCAGATGAGATGAGATTTGTTGGG 

AAATCTTTGTTACAACTTGGGCATCTTGTGGGGAGATAAACTCTCGGACTGTC 

CCAGAAACTCCTGGATTGATCTCATCTTCCTGGCTTCCTGAGCCCTGGAAGAA 

CAATCCAAGAAGACCTGGATCTTTTTCCTGCACGCCTTCGTCGCGGAATTCAT 

TCTGCTTCCAGCAAGCCATCATCATCGACATCTTC (SEQ ID NO:64) 

Line number 218 
>C1032 

gtacttttgatggcttagctcacattaaaagtctgattagatttttcaccaaa 

tcttctcttcaaagatcacattctcatagtagtcataatctgaaaatagattt 

tttttccttgtcctcaaagatatctctttaagcccaccactactagaattacac 

ctagggcactaagaatcgcacatatgaattattgcatagtgtgcaaggtcta 

ggaatatgatttctatcttcaggcactaccttcgataccctgccttctcaaaa 

aaggtggccactagaaggtgattcatcaggtaaaatgctttccctgcaaccc 

ttacagcctagatttgatctctgtgacccatagaaggaggaaagaaccagag 

ctgtgttcagcatgtgagccatggcatggtgaagctcatacataccacaaac 

aataaaagactttttaaatcactggc (SEQ ID NO:65)

Line number 237
>C1071 

TGGGGACGGCGGCTGCCCTCACCTATGGCCTTTACTGCTTCCACCGCGGCCAG 

AGCCACCGATCCCAACTCATGATGCGCACCCGGATCGCTGCACAGGGTTTCA 

CGGTCGTAGCCATCTTGTGGGGTTAGCGGCATCTGCTATGAAGTCTCAAGCCT 

GAGTATAGCCGGGTCTTAAAGCGCCATGGAAACCATTACAAAACCCAGGAAC 

AACAGACATCCCTGTCAGACTTGCTCCCTCCGTTTTAGACCGGACCTTATTGT 

CATTTGGGTGAGGAAGTGGGCCGATTTTGTAACTGATTTGCGCTTTCACCGGT 

GCCCCCTCCCGCTCCCAAAATCCCAAGGTTTCATTTCAGTTGGGGTTGCATGC 

TTCTATTTGTGGAAGGCGTCCCTTAATTTACTTTAATAAAAAGCTTATTACAA 

CTTGGTATTGT (SEQ ID NO:66) 

Line number 221 
>C1099 

GTACCAAAGGTCAACTGCAACTCTGCAGGGCAGAAAAGCCTGTCTTCATGCC 
TTTTCTCCCCTCACATGGAGGGGTGGCTAAGTCTGCAACACTCTAGAGGAAA 
CAATAATGTGTTTAAGATTGTCCTGTAGGTGTCTTCCTCCCTCTCCCTGGCATC 
CCAACCACCGACATAGGTGGGTGTATAGTAAGCGTCTTTCAGAGATGACATT 
TTGCTATTCTCCTTTTGGGGGGAAAAAAAAATCAAGCAGAGAAAATTTGATA 
AACTCCACAGTTTTAGTGTTTGTGAAAAACAAGCTGTTCCTTATTTTGTTCTCA 
AACTCTTATTTTTCCCAATAAGTCATCATGTGTTGCTGACAACAACTG (SEQ 
ID NO:67) 

Line number 242
>c1112

GTACATATGCACCAAATTCCATTTTAGAAGTTTCCATATCATTTTCATAGAAA 

ACAAAGTTTGAAAACAAGTTAACATTTAAACACAGCACGGTATTCTATCACA 

ACTGAAACTTTTCTTCTTTACAGGACTCAACAAAATCTAAAAATGAACTATGC 

TGTAGATTTACTTCATGCAAAGATCTTTATGTTATCTCTGAAAATGAAAAGAA 

TGGCTTTAAAAGCACATTTTATACTATTATGGCAACTTGTGTAC 

(SEQ ID NO:68) 

Line number 222
>c1144

AGACAGTGCTGCCTCCGTGGTGGATACGGTGGGATCCAGCTAGGTCATCTGA 

AAGATTAAGTCCCAGGATCTCAGCCAACAGTGACCATGCTTCCAGGGAAACT 

GCTTCCTGCACTTGCAGAACCACCAGTAGGCTGCAGGTGACTAACGTATCTG 

CAAGGTTAGAGATGCTCGGTGTTTCATACCCTGCGTTGGTCCTAGAGAGAAG 

GGCCTTTGGCATTGCTTTGGGCAGGTAGTGGGAGAAGAAAGGGATCCTGGGT 

GGGGTGTATTTATATATAATTTCGGTTTTGTTTGTAC (SEQ ID NO:69)

Line number 223
>C1148 

GTACAGATTATCCCTCCTGAGACGCTGTGCCATCGGCCAGCACTCTCACTCTT 

CCCACACCTTCCCTCCTGTCCTGCTACCCACTGTTAGGCTCTCAGTGTCTCAG 

ACATCTGTAAGCTGCTACTTCATCTGCAAAGAAAATAGCTGTTCTTTTAAAAA 
TTAAACTATGTAC (SEQ ID NO:70) 

Line number 224 
>C1176 

GTACTTTTTGTTCAGTCACAGAGACATGCCACCAGCCTGTTCAGACCAACTGA 
GTCAGGAGAAACATACAACATGTGAAGACTATCTCTCTCCAGCTGGTGACAC 
AAAACAGCCCAGGTTCCAACCACAGGATTATTAACAATAAAAACTACACCGC 
TAGCCTGGCACGGTGGCCCGGACCCATAATCCCAGCACTTTGGAAGCTGAGG 
TAGAAGGTCTGTGGCCAGCCTGGGCTACAGAAGTGAGACTCCTGTTGCAAAA 
AGGAAAACAATACACACAAATTGTAACCCTGCTTTTTCTAAGGAAAAAGTAG 
TTTTTAAAACAAGGGGCAAGAAATAAACTTCAAACTGAGACTCATATT (SEQ
ID NO:71)

Line number 225 
>C1201 

GTACAGTGATCTAATGTTTAGGAGGCCTTCCTTTGACTTTTTATATTCGAGGC 

AACATCTTTTTTTAAAAATCATGAATTATAGTTTTTTAGCAAGTGACCGTTAT 

GTTCTGTTTTCCTGTCGGGATCGTTTGGAAGGAAGGAGAAAACTGGCTTTGCA 
GGTGTTACTGGCTGGCACAGTAC (SEQ ID NO:72) 

Line number 226 
>C1205 

GTACTGGACTTTGTTACAAAGCGACCCAGTTGACTGCTTTGACAGGAGCAGA 

TGAGAAGCATATCTTATGTTCTGTAAGAATGAGGCTCTGAATCCCTCTCTATA 

GAAAAGGTTACAGGCACTGTTCTTTATAACCAAAGAGCTTCCAGGAGACTGA 

ACGTCTTTGCTGTCCACTCTTGTTTTGTGTGAGTATATAATCTCACTCCAATCT 

GGTGCCATACTTCCCTTGGCTACCAAGCCACGTGCTGCCCTTGGTCCCTGTCC 

CCTTTCCTAAGCACACTGAGAAATCGCACAGCTGTAACCTTCAGTCTTCCACA 

TAGCCTGGTAC (SEQ ID NO:73) 

Line number 206
>c1217

AAACAATTATACTCCTGAATTCTCTTCTCTGATATCTTTTAATATTTTAATGTG 
CACCTTTTTCTCTATATAGACACACTCATATACAGTTGAAAAGTTAAGTCAGA 
CTCTTCTTTATAGCCTTTCTGCCACTTGTTTCCTGACACCAGCTTTTATGAAAC 
AGACATATTTTCAAGAGTTTTACTTCATTTTTATTAATAATACCACAGTATTCT 
ATCATTTACTATATTAGGATTCATTCAGTATTGAATATTTAGATTGTTTTAATT 

TTTACTATCTAAAAATATGTCAGACTCATCCTTGGGTGCAACCTAGTATGCTT 
G (SEQ ID NO:74)

Adipocyte Subtractive Library Sequences

Line Number 55 
>b0010 

ACCTTGCTCTGCATGCCTTTCTCCCCCGTGCTGTCCCCTTTGGAGAGGTCACC 
AGGAACGCTTGAGGCAGTCTCCTGTCACTTTGGAGTGTGTATGTGTAGCAGGT 
CCAACAGGAAAGAATAAGTATGGTGTTGGGCTCAGCAGTCAAGAGCACACCC 
TGCTCTTACAGAGGACCTGCGTGTGGTTCCAAACACCCACTTCGGGTGGCCA 
CAAACCTCCTGTAATTCCAACCCCA (SEQ ID NO:75) 

Line Number 47 
>b0037 

ACTTCCCTCCACTTGGGGACCACTCCACTATGTGAGCATTTTCCTTTATATTTT 
TTATGAAAGCTGATCTTCCCTCTATAAGATTCCATGTTCTTAATGTTTTATCGG 
TGCCAACAGACAGAGCCAATTTGCCAGACGGGTGGATAGAAAGGAAAGTGA 

CATGTCCCCTGTGTAGCTTTAAAGGTCTTCAGGCACTTCCATCTCTTCACATCC 
C (SEQ ID NO:76)

Line Number 59 
>B0065 

ACATGTTAATGATTTATCAGTATGCCCCGAATTCGTAGTGCAGTTCTGATTCT 

CCGCATGCCCTCAGCTGTGGTCAGGTGACTTCCTGTCCCCTGGCAGCTCTGCT 

GAGTCCCTGTGTTTGAGCCTCCAGGGAGAAGGGTTTGGGCTGCATTCTTCTTA 

TCCCCATGCACAGAAACGCTCAGGGTCCCCACGTGCCTGTTGTCCTCCCCTCT 

AGTCCTTTGTTCCCTTTCTGAGCATGTGGCCCCTCCCCCGGCTGTGAGGGGCA 

CCCCTTCCTAAGAAAGTGTTAATGCAGGATCAAAGACCACCACTGCATGTGT 

AC (SEQ ID NO:77) 

Line Number 57 
>b0091 

ACTTCCTTGTGTCAAAGCAAGAACATTTCCCCACGTTAACCCAAACCTCCACC 
TCCTATCCTGAAACCCTGACCACAGGACTCACGGGAGTGTGCCGACAGCAGG 
TGCTATGATCACAAGATAATCAGAGATCCAGGAAGGAAAGGGAAGGGAGGC
TGAAGGAACAGGAAGCATGGGATAAGCAAGACAAAGTGTGATAGCTTTCAA 
AGAAAATGTTGAGTTTCTTTTGGGATGGCTATGCTGTTCTTCA (SEQ ID 
NO:78) 

Line Number 52 
>B0096 

TGGGAATAAATCAAAAATTGATCAGTCTGGTGCAGGAGCCAGGGAGCTCTGG 

CCTGCACATCCTTCTGTGCCATCTTAGATGGCAGCTACCATCAAGGTCAGGCA 

AAAAGCCTTAAGACAGCTCACTGAAAGTCTCGTGTAAGGGCATTTGGTTGGC 

GGTTTTCAAGGGCTCTCAGAACACCAGCTCCACGTGCAGGGGAAGACCATTT 

TCCAAATAAGCCATGAATGGGCCCTCTCATCTTACCCAGTCTCAAGCCCTATT 

TT (SEQ ID NO:79)

Line Number 64
>B0101 

ACCATATTTCAAGAAGATAGCAGAAATTCAATGAAGTCGATTTAAAAACCTG 

GGGCTATTTACCTGGTCTGAGGTATGCAAGCTGTCCAAAGGTGGACACGTCT 

AAATAAATTTACCATGTCAGTTATCGGCCAAATGTTTTAGCATTTTTTTAAGC 

AGGATTTTCTTGGCTGAGAAGACAAGAAGACATTTTAATATCCATTGTAGAA 

ACGAGAATGGGCTGGGCACTGATGGTGCACGCCTTTAATCCCAACACTTGGA 

AGGCAGAGGCAGGTGGATCTCTGTGAGTTGGAGTCTAGCCTGGTCTACAAAG 

TGAGTTCCAGAATAGCCAGGGCTACACAGAGAAACCCTGTGTTGGCCAAACC 

AAAGC (SEQ ID NO:80)

Line Number 63 
>b0117 

GATACTTCATATCTGTCTTCAAATATGTATTTCCTAGGTATCTTTTGCATTGTA 

ATTTTCAGTAAATATACTGAAGCTTGATTTCACATTCATTTCTAAACTTGAAC 

TGTTTTTAATTTTAAAAATAGGCACTTATACAGAGAAACTTATAAAAACAATT 

TCATACATACTGAGTTGTCTCAGCATTTCTTTAAAGTCATTTGAGCTTTCTAGT 

TTTGCTGAAACTGCATTCAATGAATCATTCAGAAAAACAAAACATATCAGAA 

GGAAAACTATCTTTTTAAAATTTTGTTCTATAACCCACCTTTTAAAAACTTTAT 

ACAAGTCTATATAATACTCTTTTTATTGTCTAGGAAGAAATTACCCTTCTACC 

ATTAGCTTTAACAT (SEQ ID NO:81) 

Line Number 53 
>b0129 

ACGAAGAGTCAGACAAAATTCCCCTCTAGCTCACCCATCCGTTGCTGAAATC 

TCTTTGGGAAGGTGGGGGAGATATAAGCCACATCTGCCCCTTCTCCTAAGGTC 

TCACTGACAAACTGAGGCAGGGTTCTGTCAGACTTTACTCCAAGGAACCAGT 

GAGTGTCATTAAAGCAGAGTGGTGTATAACAGGTGGTAGGGCGCCACTGGAT 

ACACAGATGAGAGTCTCCTGGTCCTTGACACCACTCCACAATCAGATAAAAT 

CAGTCAGCAAGGCCAGCTGGGACAACCCTTTTAAAAGGGGGACAGCTGACA 

CCTCCACCCCCCTCTGGGCTGGTGGTTGCTTGCTTCTGAGACAGTCACTCAAA 

TACTACCCCA (SEQ ID NO:82) 

Line Number 51 
>b0136 

ACATTAATGATTTATTAAAAGAAACAACTCCTTGTCCCACTCCACTGTGCTGC 
TTGTAATCTCCATACATGGCCTCCATTTTCAACTGTTTTCTTGGTCACAGAACT 
CCAAACAAACACATTTTTTTTTCCAGGTAAAAGCTGTTTTTAGTTTGTAGT 
(SEQ ID NO:83)

Line Number 54
>b0154 

ACCTATGTCATGGAAGGGAACTCTGTTACAAGCTTGAGCATATTTCAAATAA 

CTGTGGTAGTGCTGAGAAATAGGCTTTGCATCTGGCTAAGAGAAAAGAGCCT 

TCATGGAGTATCACAGAAAACTTCCAATTCTAGGTTTTCTATATACTTTGTTA 

CCTTGTTCAAAGAGATTTAATTGCTTCTATAATCAAAGAAGAGAACATGGAA 

CCCCCTCACTTCTAGGAAGTAGATGTGGTAACATTATTCCAGAGAATCTGTCA 

CCTAAAACTGAAGTGTTTACAATCCCTTAAAGCTAACCGGCTCCCCAGTGTGT 

TATACAAGGCAAGCCAAGATAGCCTCACTGTCCCTGAACTGGAAGTGTCACA 

CAGCTCAG (SEQ ID NO:84)

Line Number 26 
>b0158 

ATACTCTTTAAATATATTTTCAGTTTCTGATTTAGAGACATTGTTCACCACTGG 
ACTAGACTAACCTTGGATGCCGGTCAGTGACGACTTTGTCACACTAAATAAG 
ATGGCTGTTCTAGGCCTAAGACACTGTCTTGAGGGTCAGTAAGTGACGAGTC 
AGATGAGTTAACTGTGGGGAAACTAGAACATTCGAGCCAAGAATTATTG 
(SEQ ID NO:85) 

Line Number 56 
>B0174 

ACATTTGAATCCTTGTCAAAACAACAAAACATTAATGAGTTACTTTAGAGTAT 
CTTTGTTGAAAATGGAATTTTCAATAACTGTTTCATAATGTTTTGTTTTATTAC 
TCCTTAAGACTGATACTATAAGATGAAGATAACATTACTTCAAAATAGGTTC 
AAAGCATTTATTTTACAACTTTATGTTTTTATAAATTTCAGAATAAAAAAAAA 
AAAGCTTG (SEQ ID NO:86)

Line Number 58 
>b0175 

ACTGTGGCCTCATTCCTCCTACCCTCCAAGGGGTGCGCTATGTGGATGGCGGC 

ATTTCAGACAACTTGCCACTTTATGAGCTGAAGAATACCATCACAGTGTCCCC 

ATTCTCAGGCGAGAGTGACATCTGCCCTCAGGACAGCTCCACCAACATCCAC 

GAGCTTCGCGTCACCAACACCAGCATCCAGTTCAACCTTCGCAATCTCTACCG 

CCTCTCGAAGGCTCTCTTCCCGCCAGAGCCCATGGTCCTCCGAGAGATGTGCA 

AACAGGGCTACAGAGATGGACTTCGATTCCTTAGGAGGAATGGCCTACTGAA 

(SEQ ID NO:87)

Line Number 48
>b0188 

ACCAAACCCCATCAAATACATAAAAATGCCTATCTCCAAACACTGCATGTCT 

AGACATTTACCCTGAAGAAATACTAGCATTAGTGTGCAAAAAAAAAACAATG 

CATACTAGGTTGCACACAAGGATGAGTCTGACATATTTTTAGATAGTAAAAA 

TTAAAACAATCTAAATATTCAATACTGAATGAATCCTAATATAGTAAATGAT 

AGAATACTGTGGTATTATTAATAAAAATGAAGTAAAACTCTTGAAAATATGT 

CTGTTTCATAAAAGCTGGTGTCAGGAAACAAGTGGCTGAAAGGCTATTAAGA 

AGAATCTGACTTAACTTTTCAACTGTATATGAGTGTGTCTATATAGAGAAAAA 

GGTGCCCATTT (SEQ ID NO:88)

Line Number 60 
>b0217 

ACTAATTGCAGTTATAACCCAAAAAGACTTCAAAGACATGAAAGAGACTCTT 

GGAGATGACATTACCGTGAAAATGTATTCCCCATCCTGGCCTAACTTTGACTA 

TACTCTGGTGGTTATTTTTGTAATTGCTGTGTTCACTGTGGCCTTAGGAGGAT 

ACTGGAGTGGACTTATTGAATTGGAAAACATGAAGTCAGTGGAAGACGCCGA 

AGACAGAGAGACCAGAAAGGGAGAAGGACGATTACTTTACATTCAGTCCTCN 

CACAGTTGTTGTGTTCGTGGTCATCTGCTGTATAATGATTGTCTTACTGTATTT 

CTTCTACAAGTGGCTTGTGTTTGTTATGATAGCGATTTTCTG (SEQ ID NO:89) 

Line Number 62 
>B0237 

TTTTTGATCTTGTTTCCATTTATTAGTCTAGGAATATACAAAATTGTGACATGG 

CACAACAATCTCATGGAGAAAACAAAATACACAGGAGGGTAGCTTTCTCCCA 

ACCCCTCATAAAGGCACACACTTGAGGCTTGGTTTGGGGGGGTCCTCCTGTCC 

TGGACCCAGTGTGCTGCACAAAAAGCCAGGCACCTCTGTGAAAAGAAAAGA 

AGCCCTTCCTCACACCAAACTTGGGGTGTGGCTCAGGAGGATTTGCCCCAAA 

ACAACGCTAAACTGTGTCCATGGTGCCAACC (SEQ ID NO:90) 

Line Number 50 
>B0245 

GATACAAGCAAACTGACTTCTGAAATGGACTATCACACTCTCCACCTGCCGG 
GCCCTCTCAAGCTGAGGTGGCTTTTTGCATTTTGCTACTCCTGGAGGCCATAG 
GCCAATGGATAATCTATGTTTCTCACTGTCTGTGTTTCCACCACGTTCCTAGCT 
CCCTGGAGT (SEQ ID NO:91)

Line Number 61 
>b0274 

TTTTAATGGAAGTTTTTATTGAGCTGTATATACAATTTAAGCTATTAAAATTTA 
TACAATATTTACAAATTAAATAATCATCTGAAACTCTCCAGAAACTCCTGTAA 
CACAAAACCAAAAGAAGGAAGCAGCAGCTAGGCCTTGGCTCCCTAGGCCTG 
GCCTGGCCCGGCCCGGCCCAAAGCTTGTTCTAAACTGCGAAGTCCAGGT 
(SEQ ID NO:92) 



WO 02/33046 PCT/US01/49451 

24/77 

Figure 3E 

Line Number 49 
>b0284 

ACTTTGCTTTTTATTATTCATTCAATAATATAAATATTGTGTCAGTTTCTACAG

TCTCTTGACTTTAATGTATAGCTATCTATAGATCTGCTTCTCCTTATTTATGCT 

AGCTTTCAGGTTACTTCTGGCAACTCTGGCTTGCTTGCTGGTGAAGAGGCCCT 

GATTCCAACAAAAGCAGAAAGCAGCATTAGATTTTATTTTCCTTGTTCTAAAA 

TGCTTGGCTTTTGCTAGTCTAAAAATGATTGATTTTGATTAGAAATTTGAAAA 

TGACAAGGATTTCTGAGTGTATGTAAAACTAAACAATTTACCAATGTCATTCC 

ATTTACTAGTGTTAATACTGAAGACCTTAACAATTCCGTCCACAAAAGGTAAC 

TACAGGGGAA (SEQ ID NO:93)

Figure 8A

Primary Adipocytes - PEAK 1 
Band Protein/Accession number 

1 AM-2 Receptor (49942), human (NP_002323) 

2 AM-2 Receptor (49942), human (NP_002323)

3 AM-2 Receptor (49942), human (NP_002323) 

4 Non-Identified 

5 IGFII-Receptor/Cation-independent Mannose 6-P Receptor (1398935), human (NP_000867)

6 Fatty Acid Synthase (66561), human (G01880);
Acetyl-CoA Carboxylase (116670), human (NP_000655)

7 DNAJ-Domain containing protein (3327170), human (T00361) 

8 ABC1-Protein (6005701), human (009099)

9 IRAP (1674503), human (CAB61646) 

10 IRAP (1674503), human (CAB61646) 

11 IRAP (1674503), human (CAB61646)

12 Non-Identified 

13 ATP Citrate-Lyase (113116), human (AAB60340) 

14 Amine Oxidase (4185817), human (NP_0037250)

15 Amine Oxidase (4185817), human (NP_0037250)

16 Alpha-2 Macroglobulin Receptor (4758686), human (NP_002323);
Hormone sensitive Lipase (1346458), human (Q05469)

17 CD36 (3273897), human (NP_000063)

18 CD36 (3273897), human (NP_000063)

19 Long Chain Acyl-CoA Synthetase (126011), human (P33121)

Figure 8B

20 Long Chain Acyl-CoA Synthetase (126011), human (P33121) 

21 Phosphatidylserine-Binding Protein/Substrate of PKC (455719), human 
(NP_004648) 

22 EH-Domain-Containing Protein (1861774), human (1861774) 

23 Novel 

24 Carboxylesterase (1407780), human (AAB03611);
Vimentin (401365), human (A25074) 

25 α-Tubulin (223556), human (A23035);
GLUT4 (121763), human (NP_001033); 
DLAST (266684), human (AAD30181) 

26 GLUT4 (121763), human (NP_001033) 

27 GLUT4-P (121763), human (NP_001033) 

28 GLUT4 (121763), human (NP_001033); 
Non-Identified 

29 Junctional Adhesion Molecule (5457119), human (AAD3794)

30 SCAMP (5032077), human (NP_005689) 

31 Ribosomal Protein L6 (2507315), human (Q02878)

32 SCAMP (5032077), human (NP_005689)

33 Non-Identified 

34 Non-Identified 

35 29 kD-Golgi SNARE (3213227), human (NP_006361);
Non-Identified 

36 Non-Identified 

37 Caveolin-1 (1705645), human (AAD23745); 
Non-Identified 

38 Caveolin-1 (1705645), human (AAD23745);

Figure 8C

39 Novel 

40 Novel 



3T3-L1 Adipocytes - PEAK 1 
Band Protein/Accession number 

1 AM-2 Receptor (4758686), human (CAA38905) 

Myeloid associated differentiation protein (3212400), human (3212400)

2 IGF II-Receptor (1709091), human (NP_000867)
Non-muscle myosin heavy chain A (967249), human (P35579)

3 α-1 subunit Na+/K+-ATPase (114374), human (P05023)
SERCA1a Ca2+-ATPase (477339), human (1586563)

4 IRAP (2144020), human (CAB61646)

Leucine aminopeptidase placental (1888354), human (CAB94753)
Protein 4.1G (3064263), human (P11171)
Glutamyl-prolyl-t-RNA (4758294), human (NP_004437)
ES/130 (4759056), human (NP_004578)

5 Coatomer protein α-subunit (6642754), human (NP_004362)

6 Coatomer protein α-subunit (6642754), human (NP_004362)

7 Non-Identified 

8 Pyruvate Carboxylase (6679237), human (NP_000911)

9 Sortilin 1 (6653197), human (NP_002950) 
Myosin 1 heavy chain (480659), human (CAA67131) 
Proteasome subunit p112 (4506225), human (NP_002798)
Major vault protein (497940), human (NP_005106) 

10 Sortilin 1 (4507159), human (NP_002950)
α-Catenin (2134736), human (139438)
Protein C23 (128843), human (NP_005372)
Major vault protein (497940), human (NP_005106)

Figure 8D

11 Coatomer protein β2-subunit (4758032), human (NP_004757)
AP2 β1-subunit (4557469), human (NP_001273)

AP2 α2-subunit (6671563), human (AAD15564)

LIMP-II (126381), human (NP_002285)

GRP94 (119362), human (AAP82792)

Mix (3550456), human (NP_037506)

Proteasome subunit p97 (1060888), human (BAA11226)

12 Amine oxidase (5902787), human (NP_003725)

Long chain Acyl-CoA synthetase 2 (6679739), human (P33121)

Gelsolin (121117), human (NP_000168)

Calnexin (6671664), human (P27824)

STA 1 (6678153), human (NP_009330)

Hsp 90 (6680307), human (NP_031381)

Glycogen synthase (517112), human (BAA06154)



13 RET II (3551509), human (NP_005104) 

Hormone sensitive lipase (1708847), human (Q05469) 

14 Moesin (4505257), human (NP_002435)

Long chain Acyl-CoA synthetase 2 (6679739), human (P33121)
Calcium binding protein 2 (729436), human (NP_004902)
Cytochrome P450 (6679421), human (P16435)
GRP78 (121574), human (CAB71335)

15 Protein 4.1B (5020274), human (NP_001422)
S3-12 protein (3236368), human (NP_001113)
β1-Integrin (124964), human (P05556)

16 α6-Integrin (3183038), human (P23229)
Vitronectin receptor (6680486), human (NP_002201)
Oncoprotein LFC (6678666), human (Q92974)
eIF3-p110 (4503525), human (AAC27674)

17 Calcium binding protein 63K (2493471), human (NP_006175)
Dihydrolipoamide acetyltransferase (226207), human (226207)
Primary biliary cirrhosis autoantigen (2117706), human (P10515)
Complement C3 (1352102), human (NP_000055)

Figure 8E

18 Vimentin (281012), human (A25074) 
Carboxylesterase 1 (6679689), human (NP_036254) 
Lipoprotein lipase (6678710), human (NP_000228) 
PTRF (6679567), human (AAC63404) 
Prolyl-4-hydroxylase β-subunit (2507460), human (P07237)
ER-60 protease (927670), human (P30101) 

19 Dynein light intermediate chain 53/55 (2618478), human (NP_006132)
Calcium binding protein 1 (2501206), human (NP_005733) 
Dihydrolipoamide acetyltransferase (1710279), human (AAB50223) 
DEBT-91 (4838557), human (AAF67009) 

α-subunit ATP-synthase (6680748), human (NP_004037)



20 GLUT4 (6678015), human (NP_001033) 
Actin α-2 (4501883), human (NP_001604)
Ribosomal protein L3 (4506649), human (NP_000958)

21 SCAMP (3914963), human (O14828)
GLUT4 (6678015), human (NP_001033)

22 Annexin V (1351942), human (999924)

Glyceraldehyde 3-phosphate dehydrogenase^ 76768), human (NP_002037) 

TAX (1350763), human (Q02878) 

Ribosomal protein L5 (1173056), human (NP_000960)

23 Proteasome C2 subunit (5757653), human (NP_002777)
Proteasome 26S subunit PP31 (4506233), human (NP_002803)
Ribosomal protein L7A (4506661), human (NP_000963)
Ribosomal protein S3 (200770), human (AAA03081)

Heme oxygenase 1 (123447), human (NP_002124)

24 BAP-31 protein (2137162), human (NP_005736)
14-3-3 protein (4507953), human (NP_003397)
Prohibitin (4505773), human (NP_002625)
Micropain subunit IOTA (296736), human (CAA43964)
Proteasome subunit zeta chain (4506187), human (NP_002781)
C6-1 Proteasome PSMA7 chain (3805978), human (NP_002783)

Figure 8F

25 v-SNARE Vti1-b (3213229), human (NP_006361)
SNAP-23/Syndet (6678049), human (O00161)
BAP-31 protein (213712), human (NP_005736)
Ubiquitin precursor (2118964), human (152220)
Proteasome subunit zeta chain (4506187), human (NP_002781) 
C6-1 Proteasome PSMA7 chain (3805978), human (NP_002783) 
Ribosomal protein L13 (1083788), human (NP_000968) 

26 Synaptobrevin-like protein (1617398), human (NP_005629) 
Ferritin light chain (120524), human (P02792) 
Ribosomal protein (6677781), human (NP_000983) 

27 BET-3 (2791806), human (NP_055223) 

Myosin regulatory light chain (NP_059039), human (AAA67367) 
Type II membrane protein (), human () 

Polyposis locus protein 1 homolog (Q60870), human (Q00765) 
Cytochrome B5 (117809), human (P00167)

28 ATP-Synthase, δ-subunit (4502297), human (NP_001678)
Cytochrome B5 (117809), human (P00167)

Mitochondrial import receptor subunit TOM20 (298698), human (NP_055580)
Ribosomal protein 40S (4506695), human (NP_001013)

29 VAMP3 (6678553), human (NP_004772) 
Cystatin C (1345935), human (AAA52164) 
Histone H1b (356168), human (NP_005312)

30 Non-muscle Myosin light chain 6 (127148), human (P24572) 
Cytochrome B5 (231928), human (P00167) 

Ribosomal protein 40S (4506695), human (NP_001013) 

31 Annexin V (6753060), human (999924)
Lipocortin V (2981437), human (999924)

Membrane glycoprotein GP42 (114835), human (NP_001719)
Ribosomal protein S6 (225901), human (P10660) 

32 Glyceraldehyde 3-phosphate dehydrogenase^ 20707), human (NP_002037) 
Calpactin 1 (113951), human (NP_004030)

Ribosomal protein L5 (1173056), human (NP_000960)

33 β-Tubulin (135451), human (T08726)
IMPDH II (124427), human (P12268)

Vacuolar H+-ATPase (522193), human (AAA58661)

Figure 8G

34 α-1 subunit Na+/K+-ATPase (358959), human (P05023)
Proteasome 26S p97 (1060888), human (BAA11226)
Mix (3550456), human (NP_037506)

Gelsolin (121117), human (NP_000168)
β-Catenin (4503131), human (NP_001895)

35 AM-2 precursor (4557225), human (NP_000005)

Ca2+-ATPase 1-plasma membrane (4502287), human (NP_001673)
α-1 subunit Na+/K+-ATPase (358959), human (P05023)



3T3-L1 Adipocytes - PEAK 2 
Band Protein/Accession number 



1 Dynein heavy chain (294543), human (Q14204) 
Plectin (1709655), human (CAA91196)

2 MAP4 (7106363), human (NP_002366)
HsGCN1 (3970973), human (AAC83183)
Telomerase protein 1 (6678285), human (NP_009041)

3 Myosin heavy chain A (6981236), human (P35579)
MAP4 (7106363), human (NP_002366)

4 IQ-motif containing GTPase activating protein (4506787), human (NP_003861)
MAP4 (7106363), human (NP_002366) 

5 Ribosomal binding protein ES/130 (4759056), human (NP_004578) 



6 Kinesin-related protein (2370435), human (NP_006603) 
Isoleucine t-RNA synthetase (4504555), human (NP_002152) 

7 Pyruvate carboxylase (6679237), human (NP_000911)
Serine/arginine-rich protein specific kinase (6678135), human (AAC29140)
Ribonucleoprotein U (4758546), human (NP_004492) 



8 Glycogen synthase (6680141), human (P13807)

Figure 8H

9 Hsp70 protein (5729877), human (NP_006588)

Long chain fatty acid Acyl-CoA synthetase (126011), human (P33121)

10 Hsp70 protein (497940), human (NP_006588)
GRP78 (121567), human (P11021)
Estrogen-responsive finger protein (2137285), human (NP_005073)

Major vault protein (497940), human (NP_005106) 

11 Vimentin (138536), human (A25074)
Pl-Phospholipase I (91897), human (JC5704) 
DJ149A16.6 (Novel Protein) (4468866), human (HSPC117) 

12 26S Protease regulatory subunit 6A (2492523), human (P17980)
Human elongation factor 1-alpha (4503471), human (NP_001393)
Ribosomal protein L4 (1710511), human (BAA04887)

13 p38-2G4, Cell cycle protein (1083448), human (AAD05561) 
Ribosomal protein L3 (7305441), human (NP_000958) 
TAX responsive enhancer (6755354), human (Q02878) 

14 Nucleolar phosphoprotein (7242160), human (P06748) 
Endothelial monocyte-activating protein II (6679639), human (B55053) 

15 Pyruvate dehydrogenase (2144337), human (DEHUPB) 
Ribosomal protein L6 (2507315), human (Q02878)
Ribosomal protein L5 (1173056), human (NP_000960)

16 Ribosomal protein L6 (6755354), human (Q02878)
Ribosomal protein S4 (227229), human (NP_000998)

17 Ribosomal protein L10A (6755350), human (CAB38627)
Ribosomal protein S8 (4506743), human (NP_001003) 
BAP-31 (2137162), human (S49265) 

Ribosomal protein L13 (1350662), human (NP_000968) 

18 Ribosomal protein S14 (133785), human (NP_005608)
Ribosomal protein S19 (4506695), human (NP_001013)
Ribosomal protein S26 (6981488), human (P02383) 
Ribosomal protein L31 (4506633), human (NP_000984) 
Ribosomal protein S25 (4506707), human (NP_001019) 
Cytochrome B5 (231928), human (P00167) 

H2B histone family member (4504275), human (NP_003518)

Figure 8I

19 Histone H1.4 (1170151), human (NP_005312)
Ribosomal protein S2 (4506719), human (P15880)

20 Aspartate t-RNA ligase (135099), human (NP_001340)

Dynein light intermediate chain 2 (2494218), human (NP_006132)
26S Protease regulatory subunit 6A (2492523), human (P17980)
Tubulin β-5 (7106439), human (NP_006078)

Fig. 10D Sweet/Eisenberg hydrophobicity: Window = 11